Preface



Computer scientists create models of a perceived reality. Through AI techniques,
these models aim at providing the basic support for emulating cognitive behavior
such as reasoning and learning, which is one of the main goals of the
AI research effort. Such computer models are formed through the interaction
of various acquisition and inference mechanisms (perception, concept learning,
conceptual clustering, hypothesis testing, probabilistic inference, etc.) and are
represented using different paradigms tightly linked to the processes that use
them. Among these paradigms are biological models (neural nets, genetic
programming), logic-based models (first-order logic, modal logic, rule-based
systems), virtual reality models (object systems, agent systems), probabilistic
models (Bayesian nets, fuzzy logic), linguistic models (conceptual dependency graphs,
language-based representations), etc.

One of the strengths of the Conceptual Graph (CG) theory is its versatility in
terms of the representation paradigms under which it falls. It can be viewed, and
therefore used, under different representation paradigms, which makes it a popular
choice for a wealth of applications. Its full coupling with different cognitive
processes has led to the opening of the field toward related research communities
such as the Description Logic, Formal Concept Analysis, and Computational
Linguistics communities. We now see more and more research results from one
community enrich the other, laying the foundations of common philosophical
grounds from which a successful synergy can emerge.

ICCS 2000 embodies this spirit of research collaboration. It presents a set
of papers that we believe, by their exposure, will benefit the whole community.
For instance, the technical program proposes tracks on Conceptual Ontologies,
Language, Formal Concept Analysis, Computational Aspects of Conceptual
Structures, and Formal Semantics, with some papers on pragmatism and
human-related aspects of computing. Never before has the program of ICCS been formed
by such heterogeneously rooted theories of knowledge representation and use. We
hope that this swirl of ideas will benefit you as much as it already has benefited
us while putting together this program.
The Role of Conceptual Structure in Human Evolution



Keith Devlin 

Center for the Study of Language and Information, Stanford University, 
Stanford, CA 94305, USA, 

and School of Science, Saint Mary's College of California, 
Moraga, CA 94575, USA 
devlin@csli.stanford.edu



Abstract. We generally think of conceptual structures (and their mathematical 
representations: conceptual graphs) as a technical framework providing a sound 
basis for research and design work in knowledge representation and related 
areas of computer science. In this article, I suggest that conceptual structure 
predates computer science by some three million years. In particular, I argue
that the human brain (or rather, the brain of our hominid ancestors) acquired 
conceptual structure long before it acquired language, and that the acquisition 
of conceptual structure was the key cognitive development that led to the 
emergence of contemporary humans. Moreover, human language was the result 
of the addition of grammatical structure to an already developed conceptual 
structure. 



1 Introduction 

This article is somewhat different from the usual fare in the conceptual structures
community, in that I step back from the field and view conceptual structure as a 
feature of the human mind. I suggest that it is a feature that arose naturally during the 
course of human evolution, and indeed, played a pivotal role in the acquisition of 
language. 

The thesis I present arose during the course of an investigation I was carrying out
with a quite different purpose: to try to explain how the human brain acquired the
ability for abstract, mathematical thought. The result of that project is described fully 
in Devlin (2000), and parts of this paper are adapted from that account. 

I hope that members of the conceptual structures community will find what I have 
to say of interest. Whether the considerations I outline might lead to any advances 
within conceptual structures research I cannot say. That is not my purpose. Rather, 
having been asked by the organizers of the conference to address the broader 
philosophical issues surrounding conceptual structures, I felt there would be interest 
in the kinds of considerations I present here, where we take a view of conceptual 
structures very different from the one with which we are familiar. 

Like any evolutionary account of the development of a cognitive capacity, my 
thesis is subject to debate in practically every particular. I believe my account to be 
coherent, and consistent with what evidence is available concerning human evolution. 
Personally, I also find it plausible. But there are plenty of competing theories of 
human language acquisition. A more comprehensive approach, one that tries to consider



B. Ganter and G.W. Mineau (Eds.): ICCS 2000, LNAI 1867, pp. 1-12, 2000. 
© Springer-Verlag Berlin Heidelberg 2000







and compare all possible interpretations of the evidence, would be far too long for this 
volume. Instead, I have adopted the approach — not uncommon in contemporary 
writing on evolution — of focusing on one particular thread. 



2 The Puzzle of Human Brain Size 

The human brain is one of nature’s oddities. It is nine times larger than is normal for a 
mammal of our body size, and thirty times larger than the brain of a dinosaur of the 
same body size. Its actual size varies between 1,000 and 2,000 cubic centimeters, with 
the vast majority between 1,400 and 1,500 cubic centimeters. It reached its present 
size about half a million years ago, following a growth period that lasted about
three-and-a-half million years. In terms of relative brain size, our closest rivals are the
dolphins and porpoises, and after them the non-human primates — apes, 
chimpanzees, and monkeys. But all lag well behind us in the brain-to-body-weight 
ratio. 

In addition to its highly unusual size, the human brain is an incredible energy hog. 
It makes up less than 2% of the body’s overall mass, yet uses about 20% of its energy. 
The size and complexity of our brain also carries a cost in terms of childcare. 
Because much of the development of the human brain takes place between birth and 
two years of age, a young human is very dependent on the care of others, and has to 
be looked after fairly extensively for several years. 

From an evolutionary perspective, the human brain is, then, somewhat of a puzzle. 
Natural selection is a hard taskmaster. A particular organ will only develop if it offers 
a definite survival advantage. Given the enormous, and quite atypical, cost of the 
human brain, both in terms of energy use and the need for extensive and prolonged 
parenting, the benefits conferred by the organ must be highly significant. What are 
those benefits? Or rather, what were they during the three-and-a-half million year 
timespan over which the human brain developed to its present size? Among the 
suggestions that have been proposed as possible answers to this question are: 
language, planning, tool use, logical thinking, color vision, accurate throwing, and 
body cooling. I shall argue that none of these really works, and that the most likely 
answer is the acquisition of conceptual structure. Part of my argument will be to show 
that the acquisition of conceptual structure is a necessary prerequisite for the 
acquisition of language and logical thinking ability. 

One caution is perhaps in order at this point. The survival advantage that drives a 
particular evolutionary change is not necessarily the primary advantage the organism 
subsequently gains from the change. Natural selection often finds a novel use for a 
feature that evolved for some quite different purpose — a process evolutionary 
biologists refer to as exaptation. For example, the wings of birds developed from the 
cooling fins of certain dinosaurs. Such examples demonstrate why the suggestion that 
the brain grew in size because of its cooling effect is not quite as fanciful as might 
first appear. The brain is, after all, a highly effective organ for lowering body 
temperature, which is why wearing a hat on a cold day has such a major effect on how 
cold we feel. It is just possible — although for reasons that will emerge in this paper, 
highly unlikely — that the present use of the human brain for advanced thinking is 
merely an exaptation of a cooling device. After all, if birds could think, we could 
imagine them reasoning along the lines: "Wings are so perfectly designed for flying,
and so crucial to our lives as birds, how could they possibly have started out as
cooling devices?"



3 The Advantages of a Large Brain 

Many authorities have speculated about the main driving force behind hominid brain 
growth. See, for example, Bickerton (1990, 1995), Gibson & Ingold (1993), Hurford, 
Studdert-Kennedy & Knight (1998), and Dunbar (1997). My purpose here is not to
survey that extensive and fiercely disputed field of scholarship. Instead, I shall simply
indicate why I think some of the more commonly advanced suggestions are not
sufficient. My purpose in doing so is to set the scene for what I believe is the most 
likely explanation. 

In looking for the survival advantages that drove the enormous hominid brain 
growth for three-and-a-half million years, I am obviously not going to consider the 
advantages to the individual conferred by high intelligence in modern society — even 
if we assume that high intelligence requires a large brain.* Complex societies are a 
very recent development that came long after the human brain reached its present size 
and complexity. 

Several authorities have suggested that the use of tools led to brain growth. (See 
Gibson & Ingold 1993.) But when you consider that by the end of the Erectus period,
the brain size of the majority of Erectus members was already in the lower half of the 
modern human range, while their collection of tools was still relatively meager and 
crude, tool use can surely have been at most a contributing factor to brain growth. The 
real explosion in tool use came after the brain reached its present size. 

Likewise, although language uses large portions of the human brain, the
acquisition of language cannot have been the main driving force for brain growth.
Language (i.e., full language, with syntax) was acquired at most 200,000 years ago,
after the human brain had reached its present size. Still, given the huge advantage to
the species conferred by language, it is tempting to argue that brain growth could have
been driven by some kind of precursor to language. Fine. But what was that precursor 
(or those precursors)? 

Color vision is another cognitive capacity that has been considered as a possible 
factor leading to large brains. It offers a number of survival advantages, including 
spotting colored fruit on bushes and trees, detecting potential predators, and seeing 
greater attractiveness in a potential mate. Moreover, the timing may be about right: 
color vision could have developed about three or four million years ago. However, 
other contemporary primates besides humans have color vision, and their brains are
nothing like the size of ours, so it’s hard to see how the acquisition of color vision
could have played more than a comparatively minor role in driving the growth of the
hominid brain.

* Within the size range of the human brain there is no obvious correlation between brain size
and intelligence. Some very intelligent people have brains around 1,000 cubic centimeters,
and others who show no signs of what we would call high intelligence possess 2,000 cubic
centimeter giants. The Neanderthals, who died out about 35,000 years ago, are popularly
supposed to have been of low intelligence, and quite likely were without language, yet they
had slightly larger brains than ours, with most falling in the range 1,500 to 1,750 cubic
centimeters.

William Calvin (1993) has suggested that accurate throwing could be the main 
force driving brain growth. Personally, I am not convinced that throwing requires a 
brain anything like the size of the human brain, and so I suspect that it played at most 
a small, contributory part in driving brain growth. 



4 The Path to Thinking 

Despite the many surprising examples of exaptation that exist, my own view is that 
brain growth was driven by precisely the thing you would expect when you look at 
the purposes to which we put our brains today. Namely, our ancestors developed 
larger brains primarily to enable them to think (and hence to act smarter, including 
formulating and following plans) and secondarily to communicate better. (I classify 
communication as secondary because communication of any idea presupposes a brain 
that can first generate that idea.) My task is to explain what "thinking" consisted of
during the brain-growth period and to what end our ancestors put their improved 
communicative abilities. 

In this connection, it is important not to be seduced by the popular present-day 
conception that the mind is a computing machine, which thinks by following a 
progression of discrete logical steps. As I argued in my book Goodbye Descartes 
(Devlin 1997), the assumption that all mental processes can be captured by logical 
rules is false. Moreover, it is precisely the falsity of that assumption that explains the 
failure of the many attempts to program digital computers to recognize scenes, handle 
natural language, and exhibit artificial intelligence. At the end of Devlin (1997), I 
proposed an alternative view of the human mind as a device for recognizing patterns 
— visual patterns, aural patterns, linguistic patterns, patterns of activities, patterns of 
behavior, logical patterns, and many others. Those patterns may be present in the 
world, or they may be imposed by the human mind as an integral part of its view of 
the world. (Some might argue that only patterns of the second kind are possible. My 
thesis is independent of that issue.) 

Some of the patterns we recognize can be described using language (including 
specialized languages such as musical notation for describing certain aural patterns 
and mathematical notation for describing various mathematical patterns). Other 
patterns seem to defy all explicit description. For example, we generally have little 
difficulty recognizing a friend or relative we have not seen for several decades. In all 
probability, all details of the individual’s face will have changed — indeed, our first 
reaction is to be surprised at how different they appear. And yet we recognize them 
nevertheless. Despite the many individual differences in their appearance, we know it 
is the same face. Recognition of a face seems to involve high level patterns that 
survive any number of individual changes. 

In Devlin (1997) I also suggested that human expertise results not from learning to 
follow rules well but from acquiring the ability to recognize (and possibly create) a 
great many patterns and react accordingly. Rules are useful for acquiring a new skill. 
Expertise comes about when the brain has adapted to that new skill — when it has 
learned to recognize the crucial patterns — types of object, circumstances, or 
contextual situation — and make the appropriate response automatically and 
effortlessly. At that stage, rules are no longer needed. Driving a car is an obvious
example. Practically any human being can, and many do, become expert in driving a 
car. When we start learning to drive, we follow rules explicitly — and it shows! With 
a little practice, we become more fluent. We still follow rules, not explicitly but 
efficiently. Problems may arise when we meet conditions we are not used to: we 
sometimes revert to complete beginner status, having to think explicitly what to do. 
Eventually, after we have been driving for some time, we become so attuned to the 
various patterns and responses involved in driving that we can usually drive without 
giving the matter any attention, and our response when faced with an unexpected 
situation is instinctive and immediate. We no longer need the rules that guided the 
learning process. 

Of course, human evolution can hardly have been guided by the development of 
the ability to drive a car. And yet we easily become highly skilled at driving. What 
evolution has provided us with is the ability to recognize new patterns and develop 
behavioral responses to those patterns. I shall build on this observation in order to 
develop an explanation of the growth of the hominid brain. But before I embark on 
that task, I need to introduce the linguists’ notion of protolanguage. 



5 Language and Protolanguage 

Language is much more than a system of communication. Communication systems 
are generally mixtures of sounds, gestures, facial expressions, skin coloration, and 
bodily movements that enable creatures to inform fellow creatures of their current 
emotions, current needs, current desires, pending action, impending danger, and the 
location of food supplies. Many species of animals have communication systems, 
often of great sophistication. 

According to the thesis I am advancing in this article, our pre-human, hominid 
ancestors developed a particularly expressive communication system that linguists 
refer to as protolanguage. A typical protolanguage utterance consists either of a 
single, referring word or a pair of words that express the fact that a particular object 
has a certain property. For example: "Danger!", "Leopard!", "Me hungry",
"Mammoth dead", or "Give me!" Such a communication system is adequate for
conveying basic predication information (i.e., information no more complex than
"object X has property P") about any entity in the immediate environment or any
object for which there is a referring word. No grammatical structure is required, other
than the juxtaposition of two words. Word order is not important: "Hungry mother"
and "mother hungry" mean the same, as do "Come mammoth" and "Mammoth
come." The expressive power of protolanguage can be increased only by enlarging the
vocabulary of referring words. The linguistic utterances of children less than two to 
three years of age are restricted to protolanguage. We adults often revert to 
protolanguage communication when we find ourselves in a foreign country where we 
do not speak the language, using a translation dictionary to determine the words we 
need but having no knowledge of how to combine those words to form grammatical 
sentences. 
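
The properties just described can be made concrete with a small sketch. It is only an illustration under stated assumptions, not a linguistic model: the function names `protolanguage_meaning` and `utterance_count` are hypothetical, an utterance is assumed to be one or two referring words, and meaning is assumed to be nothing more than the unordered set of those words.

```python
from typing import FrozenSet


def protolanguage_meaning(utterance: str) -> FrozenSet[str]:
    """Treat the meaning of an utterance as the unordered set of its words.

    Assumption for illustration: word order carries no information,
    so "Hungry mother" and "mother hungry" collapse to the same set.
    """
    return frozenset(word.lower() for word in utterance.split())


def utterance_count(vocabulary_size: int) -> int:
    """Count distinct one- or two-word utterances when order is ignored.

    Single words contribute n utterances; unordered pairs of distinct
    words contribute n * (n - 1) / 2. The only way to raise this count
    is to enlarge the vocabulary.
    """
    singles = vocabulary_size
    pairs = vocabulary_size * (vocabulary_size - 1) // 2
    return singles + pairs


if __name__ == "__main__":
    same = protolanguage_meaning("Hungry mother") == protolanguage_meaning("mother hungry")
    print("Word order ignored:", same)
    for n in (10, 100, 1000):
        print(n, "words ->", utterance_count(n), "possible utterances")
```

Under these assumptions the number of expressible messages is bounded by the vocabulary alone, which is the sense in which protolanguage can grow only by adding referring words.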

Unlike protolanguage, a language has a combinatorial structure (i.e., syntax or 
grammar) that allows the expression of more complicated ideas. The rules of syntax 
tell you how to put words together to form meaningful expressions that convey 
complex ideas. Because of this structure, language may convey ideas about structured
circumstances and events outside the immediate environment. 

One obvious distinguishing feature of language is the presence of words that do not 
refer to anything in the physical world. Whereas the words of protolanguage all refer 
to something specific in the world — such as an object, a location, an action, or an 
emotional state — languages have a great many signs whose purpose is within 
language itself. English words such as if, then, because, unless, and, or, and every
have a syntactic function: their role is to provide the means of combining simple
ideas into more complicated expressions. 

There is evidence that chimpanzees, monkeys, and apes can be taught 
protolanguage, although the instruction period generally takes several years and is 
limited to at most a few hundred "words." (The "words" are usually pictures or
symbols drawn on cards or presented on a touch-sensitive computer screen.) Some of
the most intriguing research on the use of protolanguage by apes is that of Sue 
Savage-Rumbaugh with her bonobo ape Kanzi. She has described the degree to which 
Kanzi has acquired protolanguage — using picture cards to communicate — in a 
number of books and articles, most recently Savage-Rumbaugh, Shanker, and Taylor
(1998).

It is my belief that the three-and-a-half million year period of hominid brain 
growth was accompanied by — and, in a sense I’ll make precise later, was driven by 
— a steady increase in protolanguage. During this growth period, our ancestors 
acquired more and more referring words, enabling them to (think and) communicate 
about ever more things and events. Then, between two hundred thousand and
seventy-five thousand years ago, our Homo sapiens ancestors acquired syntax.
Exactly how the transition from protolanguage to language occurred is a matter of
some considerable debate in the linguistic community. As I shall now show, there is
good reason to believe that the change took place rapidly, rather than incrementally 
over a long period. 

First, when you look closely at grammatical structure (in particular the universal
grammar that is shared by all human languages^), it is clear that, with the exception
of some specialized constructions such as the passive voice, which can be "added
later," grammatical structure is an all-or-nothing thing. You simply can’t have "half a
grammar." (The seeming complexity of grammar and the ability to generate sentences
of greater and greater length comes from the fact that a few basic rules can be iterated 
indefinitely.) 
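
The parenthetical point about iteration can be sketched with a toy grammar. This is an assumed two-rule example chosen purely for illustration (S -> "the mammoth fell" and S -> "the hunter saw that" S); the function name `sentence` and the example clauses are hypothetical and are not drawn from any linguistic source.

```python
def sentence(depth: int) -> str:
    """Build a sentence by applying the embedding rule `depth` times.

    Rule 1 (base):      S -> "the mammoth fell"
    Rule 2 (iterated):  S -> "the hunter saw that" S
    """
    if depth < 0:
        raise ValueError("depth must be non-negative")
    if depth == 0:
        return "the mammoth fell"
    return "the hunter saw that " + sentence(depth - 1)


if __name__ == "__main__":
    for depth in range(4):
        text = sentence(depth)
        print(len(text.split()), "words:", text)
```

Each application of the second rule adds four words, so two fixed rules already yield grammatical strings of any desired length.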

Second, if languages had evolved gradually, they would surely have evolved at 
different rates. In that case, we should see contemporary languages of different
grammatical complexity. But we don’t. Among all the world’s languages, not one has 
a simpler structure intermediate between protolanguage and full language. 

Third, the contemporary human brain goes from protolanguage to full, grammatical 
language in a single step. This is exactly how children acquire language. For the first 
two or three years of life, they acquire more and more words, which they use one or
two at a time in the form of protolanguage. Then, suddenly, they progress to full
language. A similarly rapid transformation occurs with the emergence of creoles — 
new languages built on the vocabularies from two or more other languages. For 
example, during the nineteenth century laborers from around the world migrated to 
Hawaii to work on the sugar plantations. The original immigrants communicated with 
one another using a protolanguage which took its words from the different languages
of the new immigrants. The resulting protolanguage was what linguists refer to as a
pidgin. The immigrants’ children, however, spontaneously spoke a grammatical
language, a creole, that resulted from the addition of grammatical structure
(syntax) to the vocabulary of the pidgin. (See Bickerton 1995.)

^ See, for example, Pinker (1994) or Devlin (1997).

What occasioned the emergence of language in our ancestors — the step from 
protolanguage to syntactic language? It seems unlikely that the primary cause was an 
urge/desire for richer communication. Language arose when the only form of 
communication was face-to-face. Yet, practically all face-to-face, day-to-day 
communication about the here-and-now can be achieved using simple combinations 
of two or three content words, together with gestures. Thus, a sufficiently rich 
protolanguage is all you require. You don’t need syntax — the grammatical structure 
of language. 

But if language did not emerge solely or primarily to facilitate better 
communication, what additional survival benefits did the acquisition of syntax 
confer? I suggest that the benefits of full language came only when human social life 
had grown sufficiently complex to allow coordinated planning. For example, a well- 
developed protolanguage would enable one Homo erectus to say to another something 
like "dead mammoth" and point toward the north, or to say the Erectus equivalent of
"fruit" while gesturing westward. And off they could trot in search of food. But
protolanguage cannot convey a complex idea such as "There is a dead mammoth there
[gestures to the north], but yesterday there were tigers in the vicinity, so if we get
there and find they are still around, we should be prepared to move on over there
[gestures to the west] where there is plenty of great fruit with little danger." For that
kind of communication, you need the syntactic structure of language. But such 
communication is only necessary when the creatures doing the communicating can 
generate such complex thoughts. 

Thus, I believe that language did not evolve primarily to facilitate greater
communication. Rather, it arose, almost by chance, as a by-product of our ancestors 
acquiring the ability for an ever richer understanding of the world in which they 
found themselves — both the physical environment and their increasingly complex 
social world. The key development, I suggest, was "off-line thinking": the capacity
to reason in an abstract, "what if?" fashion. (Off-line thinking is to be distinguished
from on-line thinking, sometimes called stimulus-response activity, which is
occasioned by sensory inputs from the environment and results in an appropriate 
responsive action. Neither the idea of off-line thinking nor the name is original with 
me. I discuss the concept at length in Devlin 2000, showing that the capacity for off- 
line thinking is in fact equivalent to having syntax.) 



6 Brain Types 

Here is my theory as to what drove the growth in the human brain. Our hominid 
ancestors were easily outclassed by other species in size, strength, and speed. They 
did not have the obvious advantages of lethally sharp claws or large, powerful jaws 
with long, razor-sharp teeth. Nor did they have thick skins or indeed any other 
anatomical defense against being ripped apart and eaten. They were masters of no 
particular habitat and survived only by remaining highly adaptable, moving from 
place to place and from terrain to terrain in search of whatever food they could find
— fruit, nuts, leaves, eggs, and occasionally meat from animals. Success in this 
lifestyle depended on a rich view of the world — the ability to recognize a large 
collection of patterns. The greater and the richer our ancestors’ understanding of the 
world, the greater their chances of survival — by foreseeing, and then staying away 
from, danger and by figuring out where the next meal would come from and acting 
accordingly. 

An obvious instance where it would have been advantageous to recognize a larger 
collection of patterns would be in tracking. An injured rhinoceros might still fend off 
an Erectus band who happened to come across it. But if the band could track the 
animal over several days, taking advantage of clues such as footprints, broken 
branches, and animal droppings, then they could follow the rhino at a safe distance 
until it was sufficiently weakened that they could move in for the kill. The greater the 
number of patterns the hunters could differentiate, the more clues they could take 
advantage of, and the better they would be able to track their prey. The human ability 
to track prey, often over great distance for several days, is not shared by apes or 
chimpanzees, so it could well have been one of the factors that drove early hominid 
brain growth. If our ancestors could also exchange information about the hunt using 
some form of utterances, they could coordinate their efforts, and so increase their 
chances of success. 

In other words, our ancestors were highly adaptable, nomadic scavengers who 
lived by their wits. Survival depended on their being smarter than any other species, 
and most likely by being able to communicate much finer information than any other 
species. They developed an intense curiosity about their environment and about all 
other species, a desire — indeed, an instinct — to understand and explain (at least to 
themselves) whatever they saw. This survival feature, unique to humans (at least in 
extent), is something we recognize today, even in young babies. 

Of course, a great deal of what we nowadays think of as "acting smart" involves
trains of thought that presuppose syntax (what I referred to earlier as "off-line"
thought). But syntax emerged at the end of the three-and-a-half million year period of
brain growth. What exactly did acting smart amount to before then? Let’s start out by
looking at some simpler examples. 

The sunflower constantly turns its head to face the sun. Since it requires the sun’s 
rays to live and grow, its behavior could be said to be "acting smart" from "the
sunflower’s perspective." The sunflower’s turning is somewhat analogous to our
heading for the refrigerator when we are hungry. But most of us would be very
reluctant to talk about a sunflower even having a perspective, let alone describe its
movements as "acting smart."

There are simple water-based bacteria which will move toward nutrients in the 
water and will swim away from regions that contain chemicals poisonous to them. 
Although this is a fairly simple, low level, chemical response, it serves the survival 
needs of the bacterium well enough. Still, it is not what we generally mean by "acting
smart."

Stomphia coccinea is a species of sea anemone whose ocean environment typically 
contains eleven types of starfish, only two of which prey on the anemone. If one of 
the nine non-threatening species of starfish happens to brush against a Stomphia, the 
anemone does not react. If one of the two predator starfish touches a Stomphia, 
however, the anemone recoils immediately. As with the bacteria, this is purely an 
automatic response to a particular chemical or collection of chemicals found in the 
two predator species but not in the other nine. But the Stomphia’s ability to
distinguish between two categories of starfish, dangerous and not, is a cognitive
ability that helps to keep it alive. Are we fully justified in maintaining that it is not
"acting smart"?

Moving on to creatures with brains, when rooks see an animal coming too near 
their nest, they will pick up stones and drop them on the invader to drive the threat 
away. Is that "acting smart"? A leaf-cutting ant that encounters an opening too narrow
for the leaf it is carrying will maneuver the leaf until it can pass through the opening.
Does that qualify as "acting smart"? How about the octopus which figured out how to
unscrew a mason jar in order to get to food inside? Acting smart or not?

What all these examples have in common is that the organism or creature in 
question produces an action that is appropriate for a given situation. There is no need 
for a "thought process" mediating between a stimulus from the environment and the
production of the response; it can be quite automatic. The key in terms of ensuring the 
organism’s continued survival is that there be a link between a type of stimulus and a 
type of responsive action. To make this observation a bit more precise, we need to 
analyze the notion of a "type" (of object, environment, action, etc.).^

Type recognition is the key to life. The bacterium in water that moves toward
dissolved nutrients and away from poisonous water can be said to "recognize" the
types nutritious and poisonous ("good" and "bad" if you like). The Stomphia
anemone is able to distinguish between two kinds of starfish: dangerous and
non-threatening. (In human scientific terms, the Stomphia’s single "dangerous" category
contains two types of starfish, and its single "non-threatening" category contains nine.
Eleven types for humans, two for Stomphia. What constitutes a type is very much in
the eye of the beholder.) These two examples illustrate that it does not take a high
intellect in order to "recognize" a type. Indeed, it does not even require a brain. All
that is required is that the creature modify its behavior in a systematic way according 
to the type. 
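The link between stimulus types and response types can be sketched in a few lines of Python. This is only an illustration of the idea in the paragraph above: the species labels are hypothetical placeholders, since the text says only that two of the eleven starfish types are predators.

```python
# Hypothetical species labels: the text only says that two of the eleven
# starfish types in the anemone's environment are predators.
ALL_STARFISH = [f"starfish_{i}" for i in range(1, 12)]  # eleven human-level types
PREDATORS = {"starfish_1", "starfish_2"}                # assumed labels

# One response type is linked to each recognized stimulus type.
RESPONSE = {"dangerous": "recoil", "non-threatening": "no reaction"}


def stomphia_type(species: str) -> str:
    """Collapse the eleven human-level types into the anemone's two types."""
    return "dangerous" if species in PREDATORS else "non-threatening"


def react(species: str) -> str:
    """Systematic behavior: the response depends only on the recognized type."""
    return RESPONSE[stomphia_type(species)]


recognized_types = {stomphia_type(s) for s in ALL_STARFISH}
print(len(ALL_STARFISH), "types for humans,", len(recognized_types), "for Stomphia")
```

Nothing in the sketch resembles a thought process: the behavior is a fixed lookup, which is all that "recognizing" a type requires in the sense used here.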

The cognitive requirement for the conscious acquisition of types is the ability to 
recognize similarities and differences: to realize that some things are similar — they 
are of the same type — and other pairs of things are different — they are not of the 
same type. But type-driven activity does not require cognition. For example, a simple 
thermostat can "recognize" the types warm and cold — by which I mean that it reacts
appropriately to the two types. Of course, one important difference between a 
thermostat and an organism is that the types a living creature recognizes matter to it. 
Even for an organism as simple as a bacterium, the nutritious/poisonous distinction is 
a matter of life and death. But nothing matters to an inanimate type-recognizer. 

In addition to the types nutritious and poisonous, which apply to ingestible 
substances, most animals recognize the types friend and foe, which apply to other 
animals. Adequate recognition of these types is again often a matter of life and death. 
In modern society, medical doctors are people who have learned to recognize a great 
many types dealing with conditions of the human body: the type of having a cold, the 
type of having influenza, the type of being HIV positive, the type of having a high 
cholesterol level, the type of being overweight, blood type, and so on. Indeed, much 
of the training doctors have to complete prior to being allowed to practice involves 
learning to recognize a great many such types, and learning to link each type of 
medical condition to a suitable type of treatment. 



^ A detailed analysis of types is provided in Devlin (1991). 
One evolutionary path that many creatures have followed — including ourselves
— is to increase the number of types they recognize and respond to. Such species
"progress" by successive generations responding better to various types than did their
ancestors, and by differentiating new types (as the environment changes), which their 
ancestors did not differentiate. 

Much of the time, such evolutionary developments comprise the acquisition (by a 
new generation) of new automatic stimulus-response links. These require no 
conscious effort or even cognitive activity of any kind. Humans, however, often do 
make a conscious effort to increase our repertoire of types. This can be at the level of 
a particular individual, such as a trainee doctor, or it can be at the species level, as in 
medical research. Much medical research amounts to increasing the number of types 
of body conditions that can be recognized, and expanding the collection of types of 
treatment that can be effectively applied. This may involve refining existing types or
splitting them into subtypes. Or we may discover that what were once thought to be
separate types are really subtypes of the same type — a new common thread is 
observed. 

Clearly, distinguishing types is the very essence of life, or at least of staying alive. 
Type recognition is so important that in many animals, large parts of one particular 
organ have evolved to handle types — to recognize types and generate responses of 
appropriate types. That organ is the brain. Simple brains will recognize one or more 
types and in each case produce a bodily response of the appropriate type. Bigger 
brains have a greater repertoire of types, both types they can recognize and response 
types they can generate. 



7 Conceptual Structure 

I suggest that the massive growth of the hominid brain was driven largely by the 
acquisition of more and more types, giving our ancestors an increasingly rich view of 
the world (i.e., allowing them to differentiate more and more regularities in the 
world). 

Some time between 75,000 and 200,000 years ago, after our ancestors’ brain had 
reached its present size, it underwent a structural change (for what else could it have 
been?) that gave us syntax. I have already indicated that this change occurred rapidly. 
Since evolutionary changes are usually minor and incremental, what that implies, 
presumably, is that a certain threshold level of brain complexity was reached, so that a 
structural change that was in itself minor resulted in a major functional change in the
brain’s representational, analytic, and communicative capacities. Is there anything 
else, in addition to a rich collection of types (equivalently, a large protolanguage 
vocabulary) that would have been necessary for such a threshold change to take 
place? 

I suggest there is: A complex structure that connects types together. Without that, I 
believe, the jump from protolanguage to language would have been far too great to 
have occurred in a short space of time and be the result of a minor structural change in 
the brain. Given a sufficiently rich repertoire of types and an associated structural 
network, however, it is possible to posit simple mechanisms whereby a single, small 
structural change in the brain could have given rise to syntax. One such, which I 
describe in full in Devlin (2000), is that the brain exaptated the already existing neural
circuits for generating syllables from phonemes. 

Let me give you some examples of the kind of type-structure that is necessary for 
the emergence of syntax. We recognize the type of all people. We also recognize 
various sub-types: male humans and female humans, children and adults, people with 
college degrees, farmers, the types of Americans, English, Germans, Mexicans, and so
forth. 

In addition to some types being subtypes (or refinements) of others, types form a 
complex, interrelated web. For instance, there are male, adult, Mexicans and female, 
American farmers with college degrees. We make implicit use of this web whenever 
we communicate with one another; it is part of the way we understand the world. 
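A minimal sketch of such a web of types follows. It rests on an assumption made purely for illustration, namely that a type can be identified with the set of properties it requires; the text itself does not commit to any particular representation.

```python
# Assumption for illustration: a type is identified with the set of
# properties it requires. The text does not commit to this representation.
TYPES = {
    "person": frozenset(),
    "male": frozenset({"male"}),
    "female": frozenset({"female"}),
    "adult": frozenset({"adult"}),
    "Mexican": frozenset({"Mexican"}),
    "American": frozenset({"American"}),
    "farmer": frozenset({"farmer"}),
    "male adult Mexican": frozenset({"male", "adult", "Mexican"}),
    "female American farmer": frozenset({"female", "American", "farmer"}),
}


def is_subtype(a: str, b: str) -> bool:
    """a refines b when a requires every property that b requires."""
    return TYPES[b] <= TYPES[a]


def supertypes(a: str) -> set:
    """All other types in the web that a is a refinement of."""
    return {b for b in TYPES if b != a and is_subtype(a, b)}


print(sorted(supertypes("male adult Mexican")))
```

Because one type can refine several others at once, the result is a web rather than a tree, which is the point of the examples above.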

I believe that the development of such a type structure was a key step toward 
language. It may well be where our ancestral line parted company with the other 
primates in terms of cognitive development. Certainly, no evidence indicates that any 
nonhuman animals that have been taught protolanguage have such a structure. They 
may have a few types that apply to things in the world. But there is no evidence to 
suggest that they are aware of a framework of connections between those categories. 

In contrast, I suggest that in order for Homo erectus’s enormous brain to have
provided sufficient survival advantage to outweigh the cost of supporting it, the 
acquisition of a rich collection of types must have been accompanied by the 
acquisition of a type structure — that is, their world view comprised not only types 
but connections between them. 

Of course, types connected by structural relations in this manner is exactly what we 
mean by conceptual structure (and represent mathematically by conceptual graphs). In 
other words, I conclude that the principal driving force behind the three-and-a-half 
million year growth of the human brain, leading to the acquisition of language 
between 75,000 and 200,000 years ago, was the acquisition and development of 
conceptual structure in the pre-human hominid brain. Moreover, human language 
resulted from the addition of syntax on top of the conceptual structure that had 
already been acquired. 
Abstract. This paper deals with different views of lexical semantics.
The focus is on the relationship between lexical expressions and conceptual
components. First the assumptions about lexicalization and decompositionality
of concepts shared by most semanticists are presented,
followed by a discussion of the differences between two-level-semantics and
one-level-semantics. The final part concentrates on the interpretation
of conceptual components in situations of communication.



1 The Classical View 

Following the classical scholastic view, linguistic signs are related to two types of
entities:

(a) the type of cognitive entities, concepts

(b) the type of entities of the external world 

There is a direct relation between signs and concepts, between concepts and 
entities of the external world, and there is an indirect relation between signs 
and entities of the external world being mediated by concepts. These relations 
have been represented by the well-known semiotic triangle (first in Ogden and
Richards, 1953; cf. Lyons, 1977, 96.): 



[Figure: the semiotic triangle, with vertices A (the sign), B (cognitive entity: concept), and C (external world: referent)]



This representation is in accordance with the traditional analysis of signifi- 
cation as expressed in the famous scholastic maxim: 
“Voces significant res mediantibus conceptibus”

(Words signify things by means of mediating concepts) 

As Lyons has pointed out there is considerable disagreement about the details 
of the triadic analysis of signification (cf. Lyons, 1977, 99): 

— should A be defined as a physical or a mental entity? 

— what is the psychological or ontological status of B? 

— is C something that is referred to by uttering the sign? 

(If this were so, how can signs signify something while they are not used?)

— is C the totality of things that might be referred to by uttering the sign? 

— is C some typical or ideal representative of this class? 

In modern semantics there are two ways of answering those questions: 

(1) the post-Saussurean view: A and B are both psychological (or mental) entities.
They constitute the linguistic sign as having two aspects: the aspect of
the signifiant, image acoustique, i.e. the phonological form, and the aspect
of the signifié, the concept. The meaning of a linguistic sign is then, following
Saussure, composed of the intrinsic relation between signifiant and signifié
and the meaning relations the sign holds to all the other signs of a given
language, the valeur. This Saussurean conception has been modified by modern
semanticists in the following way: Concepts (which are not clearly defined 
by Saussure) are considered as abstract and collective entities in contrast to 
individual mental images, ideas or thoughts. They are relatively stable (in 
contrast to most psychological views) and highly structured. The principles 
of structuring concepts are part of the human cognitive endowment, they are 
innate. Linguistic expressions encode concepts as their semantic content cut 
out of the conceptual pool which is universal, i.e. independent of any existing 
language. Because of this twofold relationship of linguistic expressions and 
concepts as semantic content and universal conceptual structure this kind 
of semantics has also been called two-level-semantics. It has been worked 
out by Bierwisch, Lang, Wunderlich, Schwarz. 

(2) A second position in modern semantics is the view that linguistic forms are
immediately related to concepts without any intermediate level of semantic 
content. It has been worked out by Jackendoff, Lakoff, Fauconnier, Lan- 
gacker. 

In spite of the difference between the two kinds of semantics there are some 
common assumptions concerning the relation between linguistic expressions and 
concepts: 

First of all, it is a common assumption that there are linguistic expressions 
which don’t encode any concept at all, as for instance pronouns, interjections, 
the single words of an idiom or a formula like good bye or hello. 

A further common assumption is that some concepts have no corresponding 
word, and can be encoded only by a phrase. Speakers of French don’t have a 
word for the concept expressed by English sibling or German Geschwister, but may
nonetheless have the concept of “sibling” characterized as child of same parents,
and object of many beliefs and expectations, a concept which has frère and sœur
as subcategories. We don’t have a word for “wheeled vehicle” or “bad person”.
Whether lexical gaps, in general or in a particular language, are accidental or
systematically distributed is an open question. There seems to be some evidence
that lexical gaps are generally high-leveled in the hierarchy of concepts and
constitute generic terms for a range of lower leveled concepts and words (cf.
Fellbaum 1996).

The third common assumption concerns the overall phenomenon that words 
can - in an actual use - encode a whole range of concepts. Suppose Mary says to 
Peter: 

(1) Open the bottle 

In most situations, she would be understood as asking him to uncork or uncap 
the bottle depending on the properties of the referent of the direct object: thus 
opening a corked bottle means uncorking it, and so on. Uncorking a bottle may 
be the standard way of opening it, but another way is to saw off the bottom, and 
on some occasion, this might be what Mary was asking Peter to do. Or, suppose
Mary says to Peter: 

(2) Open the washing machine 

In most situations, she will probably be asking him to open the lid of the 
machine. However, if Peter is a plumber, she might be asking him to unscrew the 
back; in other situations, she might be asking him to blow the machine open,
or whatever. 

The general point is that expressions like open can be used to convey indef- 
initely many concepts. It is impossible for all of these to be listed in a lexicon. 
Nor can they be generated by taking only the linguistic context, particularly the
direct object, into account. It seems reasonable to conclude that a word like
open is often used to convey a concept that is encoded neither by the word itself
nor by the verb phrase open X. (Cf. for similar examples: Searle 1980; Pinkal
1995; Pustejovsky 1995; Sperber/Wilson 1998.) The common claim is that lexical
meaning is flexible, but not structureless, and from this it follows that lexical
meanings can be assigned a description of their structure in terms of features or
meaning postulates.

The claim of decompositionality of the conceptual content of words is far
from being without problems: Johnson-Laird (1987) tested the ability of
non-experts to give definitions of word meanings. He chose four levels of
semantic complexity, and predicted that semantically complex verbs, such as
watch and lend, would be easier to define than the semantically simplest verbs,
such as see and own. This prediction was confirmed: It was easy for the subjects
to break down the meaning of a complex verb into simpler components for
which there are corresponding words, but it was hard for the subjects to find
such components for a simple verb.
Some semanticists therefore assume that there is a set of semantic primitives
that cannot be further analyzed, such as “cause”, “bring about”, “vision”, “human”,
“thing”, “place” and so on. They are the basic features for building up
more complex word meanings. (Cf. already Katz/Fodor 1963; Wierzbicka 1972;
Bierwisch 1992; Lang 1994.) But even if we admit that there is only a subset
of word meanings which can be analyzed in terms of features there are some
problems left: The problems concern the organization of the features as well as
their logical status. The features “animal” and “concrete object” are analytical
implications of the meanings of cat and car respectively; they have the status of
meaning postulates under the condition that the words are used in a standard
non-marked context. Meaning postulates may vary from language to language
(cf. Schwarze, 1987). The German preposition auf analytically implies “in contact
with (x,y)” whereas the French preposition sur does not analytically imply
“in contact with (x,y)”, cf.:

(3) L’avion plane sur la ville

*Das Flugzeug schwebt auf der Stadt 

Another group of word meanings can well be described by components, but 
they have another organization and logical status than meaning postulates. This 
is the case with words like elephant, tiger, lemon, water and so on, so called natu- 
ral kind terms. Most speakers of English are able to say what an elephant is, they 
have seen an elephant in the zoo, or a picture of one, and they know something 
about the nature of the animal. Yet the term is a theoretical one (cf. Johnson-
Laird 1987, 203.) It designates a set of creatures within our categorization of
animals. Our knowledge of such matters is far from complete. We don’t know 
for certain what the essentials of elephanthood, tigerhood or cathood actually 
are. These words notoriously give rise to the problem of delimiting what should 
be said in a dictionary and what should be said in an encyclopedia, because it 
is doubtful whether there are any necessary and sufficient conditions for defining
them (cf. Putnam 1975; Lutzeier 1985; Schwarze 1987; Harras 1991). If
someone tells me that she saw an elephant in the zoo, then I will interpret her
utterance to mean that she saw a large four-legged mammal with tusks and a
trunk. But these characteristics are not essential, and they are not mere induc- 
tions, since to check them inductively presupposes some independent method 
for first identifying elephants; in fact they are part of our ’’theory” of elephants 
(cf. Johnson-Laird 1987; Putnam 1975.) which tells us that the stereo- or pro- 
totypical member of the class has each of these attributes. The lexical meaning 
of elephant must therefore be represented by a schema of the stereo- or pro- 
totypical animal, a mental model with a set of default values, that is, specific 
values for variables that can be assumed in the absence of information to the 
contrary. Default values have a special status concerning their contribution to 
the truth conditions of the sentence where the word occurs. Necessary components
of a word’s meaning support valid inferences, whereas default values hold only in
the case that nothing is asserted to the contrary. Stereotypes as the content of
the meaning of natural kind words may vary from culture to culture; they are
not necessarily dependent on a certain language (cf. Putnam 1975; Schwarze
1987). The same is true of words for artefacts like knife, hammer, plate, table
and so on.
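The difference between necessary components and default values can be made concrete with a small sketch. The class name, the field names, and the choice of "kind: animal" as a necessary component are assumptions for illustration only; the default values are taken from the elephant example above.

```python
from dataclasses import dataclass, field


@dataclass
class Schema:
    """A mental model of the stereotypical member of a class (illustrative only)."""
    name: str
    necessary: dict = field(default_factory=dict)  # support valid inferences
    defaults: dict = field(default_factory=dict)   # hold unless contradicted

    def interpret(self, asserted=None):
        """Combine the schema with what an utterance explicitly asserts."""
        asserted = dict(asserted or {})
        for key, value in asserted.items():
            if key in self.necessary and self.necessary[key] != value:
                raise ValueError(f"{key}={value!r} contradicts a necessary component")
        # Asserted information overrides defaults; necessary components always hold.
        return {**self.defaults, **asserted, **self.necessary}


# Assumed analysis: "animal" is treated as necessary, the rest as defaults.
elephant = Schema(
    name="elephant",
    necessary={"kind": "animal"},
    defaults={"size": "large", "legs": 4, "tusks": True, "trunk": True},
)

print(elephant.interpret())                  # the stereotypical elephant
print(elephant.interpret({"tusks": False}))  # a tuskless elephant is still an elephant
```

An assertion to the contrary replaces a default without any contradiction, whereas denying a necessary component is rejected outright.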

Having reviewed the assumptions common to most modern semantic theories,
let us now look at their differences. I will first discuss the so-called
two-level-semantics.

2 Two Conceptual Domains: The Two-Level-Semantics

Concepts appear within this theory in two domains: 

— in the domain of semantic form as the conceptual content of a lexical ex- 
pression 

— in the domain of conceptual structure in terms of which the actual interpre- 
tation of a given linguistic expression is specified. 

The domain of semantic form is related to the language-dependent repre- 
sentation of a conceptual structure, the conceptual structure is related to the 
universal representation of encyclopedic background knowledge, contextual in- 
formation and situational conditions. The semantic form of a lexical expression 
constitutes its core meaning, that is, the context-free meaning as stored in long 
term memory. The domain of conceptual structure is needed for the interpreta- 
tion of a given lexical expression in a certain context and situation. The focus 
of two-level-semantics is upon the representational aspect of meaning as well as 
on the dynamic procedural aspect of information processing. This kind of se- 
mantics is therefore claimed to be a part of cognitive science and the cognitive 
information processing system does not necessarily have to be a human being. 

The distinction between semantic form and conceptual structure is mainly 
motivated by the overall phenomenon of the underdetermination of linguistic 
expressions. Well known examples are the following (cf. Bierwisch/Schreuder 
1992; Schwarz 1992): 

(1) John left the institute an hour ago 

(2) John left the institute a year ago 

Linguistically, we know that John is a proper name by means of which we may 
refer to male persons identified by this name. This is the information represented 
in the semantic form of John. Conceptually, we have a specific knowledge about 
each person named John we happen to know. This knowledge has nothing to do 
with the knowledge of English. 

In (1) the institute most likely refers to a building and leave is interpreted as 
a change of place, while in (2) the institute refers to an institution and leave is 
interpreted as a change of profession. The different time intervals in (1) and (2) 
bring in different background knowledge not contained in any of the expressions 
in (1) and (2). 

Another case of linguistic underdetermination is the following: 



(3) The office is closed 







Gisela Harras 



If (3) is uttered by an employee of the office, the office door may be open, while
if it is uttered in another situation the door may be locked. In all these cases the
interpretation of a given utterance is first based on the semantic form of the
expressions which activates the relevant background knowledge of conceptual 
structures. The semantic form is only a part of the final interpretation, or more 
precisely: the conceptual interpretation of a given semantic form would have 
to contain this semantic form as a proper substructure, or at least the weaker 
condition must hold: the semantic form has to be embeddable into the concep- 
tual interpretation, with embedding to be conceived as the relation of a partial 
model to a more complete model the partial model is compatible with. (cf. Bier- 
wisch/Schreuder 1992; Kamp/Reyle 1994.) 
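The embedding condition can be illustrated with nested dictionaries standing in for partial and more complete models. This is a rough sketch, not the formal apparatus of the cited works; all feature names (event, agent, source, sort, change) are hypothetical.

```python
def is_embeddable(partial: dict, full: dict) -> bool:
    """True if everything fixed by the partial model also holds in the full one."""
    for key, value in partial.items():
        if key not in full:
            return False
        if isinstance(value, dict):
            if not isinstance(full[key], dict) or not is_embeddable(value, full[key]):
                return False
        elif full[key] != value:
            return False
    return True


# Semantic form of "John left the institute": underdetermined, context-free.
semantic_form = {"event": "LEAVE", "agent": "John", "source": {"lexeme": "institute"}}

# Two conceptual interpretations, enriched by background knowledge.
an_hour_ago = {"event": "LEAVE", "agent": "John", "change": "place",
               "source": {"lexeme": "institute", "sort": "building"}}
a_year_ago = {"event": "LEAVE", "agent": "John", "change": "profession",
              "source": {"lexeme": "institute", "sort": "institution"}}

print(is_embeddable(semantic_form, an_hour_ago))  # True
print(is_embeddable(semantic_form, a_year_ago))   # True
print(is_embeddable(an_hour_ago, a_year_ago))     # False
```

The same semantic form is compatible with both interpretations, while the two interpretations are incompatible with each other. That is the sense in which the semantic form is only a part of the final interpretation.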

The structure of a full lexical entry contains four kinds of lexical information
(cf. Bierwisch/Lang 1987; Bierwisch/Schreuder 1993; Lang 1994): 

— the phonetic form represented by phonological features 

— the grammatical features, syntactic category, finiteness 

— the semantic form represented by categorized variables x, y, P, formal
components such as C, C, →, and material components such as CAUSE,
BECOME, PLACE.

— the argument structure of verbs and some nouns. 
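The four kinds of information listed above could be held in a record like the following. This is a sketch under stated assumptions: the field names, the entry for open, and its decomposition into CAUSE and BECOME are invented for illustration and are not taken from the cited works.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class LexicalEntry:
    phonetic_form: str          # stands in for a set of phonological features
    grammatical_features: dict  # syntactic category, finiteness, ...
    semantic_form: str          # variables plus formal and material components
    argument_structure: tuple   # variables that must be realized syntactically

    def unbound_arguments(self) -> list:
        """Arguments that do not occur as variables in the semantic form."""
        variables = set(re.findall(r"\b[a-z]\b", self.semantic_form))
        return [arg for arg in self.argument_structure if arg not in variables]


# Hypothetical entry; the decomposition is illustrative, not an attested analysis.
open_entry = LexicalEntry(
    phonetic_form="/open/",
    grammatical_features={"category": "V", "finite": False},
    semantic_form="CAUSE(x, BECOME(OPEN(y)))",
    argument_structure=("x", "y"),
)

print(open_entry.unbound_arguments())  # an empty list means the entry is consistent
```

The consistency check reflects the idea that argument structure and semantic form are linked: each argument position should correspond to a variable in the semantic form.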

The natural domain of two-level-semantics are simple and complex verbs, 
dimension adjectives and prepositions (cf. Wunderlich, 1997). Their semantic 
form is much more structured than the semantic form of nouns, especially words 
for natural kinds or artefacts like cat, camel, lemon or chair, knife and car. The 
conceptual analysis of these words may be infinite and the object of all kinds 
of contingent knowledge. The semantic form of words for natural kinds is poor;
it contains only the information NATURAL KIND, ANIMAL; their conceptual
structure may be extremely rich. The semantic form of artefacts contains the
information of specific functions that allows for inferences about the relation
between their referents and human activities. In accordance with this assumption
the representations of the semantic form of cat and camel on the one side and
chair and knife on the other are the same in all relevant aspects. The differences
between cats and camels and between chairs and knives are captured in the
conceptual structure. This strategy implies a problematic division of labour, that
is: linguistics is responsible for the formal reduction, while psychology, philosophy
or whatever world science is responsible for the material instantiation. However,
besides this very general problem, there are at least two serious problems left, one 
of which concerns rather methodological difficulties, whereas the other relates to 
more fundamental questions about information processing: 

— the first problem concerns the distinction between assigning semantic and
conceptual information to a given lexical expression. As Lang (1994, 28)
has pointed out, the fact that the conceptual structure is inaccessible to
direct observation may lead to a thoughtless overgeneralization of semantic
properties of a given language. This may be true even if semanticists respect
the caveat principle of cross-linguistic and intermodal comparison (cf. Lang
1994; Meyer 1992; Krifka/Wenger 1990; Dolling 1992).

— the second problem concerns the question of how human beings are capable
of conveying and understanding the relevant information of a given linguistic
expression, if most of their utterances are semantically underdetermined. I will
turn to this problem in the last part of my paper.



3 One Domain of Concepts: One-Level-Semantics 
(Jackendoff) 

The fundamental assumption of Jackendoff (1983; 1990; 1997.) is that the gram- 
matical structure of natural languages offers an important new source of evidence 
for (the theory of) cognition. The grammatical structure of a natural language 
is regarded as a triple, consisting of phonological structures, syntactic structures 
and conceptual structures. Phonological, syntactic and conceptual structures are 
determined by phonological, syntactic and conceptual formation rules. The three 
structures are connected by corresponding rules (cf. Jackendoff 1997, 39.): 

[Figure: phonological, syntactic, and conceptual formation rules determine
phonological structures (PS), syntactic structures (SS), and conceptual
structures (CS), respectively. PS-SS corresponding rules connect PS with SS,
and SS-CS corresponding rules connect SS with CS.]



Phonological and syntactic structures are modules of their own, whereas con- 
ceptual structures must be linked to all the other sensory modalities. In contrast 
to two-level-semantics Jackendoff (and others like Fauconnier 1985; Lakoff 1987; 
Langacker 1988; Pustejovsky 1995.) does not postulate a level of semantic form
as an interface between grammar and concepts: “...there is not a form of mental
representation devoted to a strictly semantic level of word meanings, distinct 
from the level at which linguistic and nonlinguistic information are compatible. 
This means that if, as is often claimed, a distinction exists between dictionary 
and encyclopedia lexical information, it is not a distinction of level; these kinds 
of information are cut from the same cloth.” (Jackendoff 1983, 110.) 

This position has at least two consequences: 

— if semantics takes into account that lexical meanings are highly context de- 
pendent then a one-level conception has to match all (possible) contexts or 
phrase structures to a given lexical item. 

— There must be some rules on the level of the conceptual structure allowing for
re-interpretations of a lexical expression in a certain context and situation.

The first claim is fulfilled in so far as the rules of corresponding syntactic
and conceptual structures are applied to syntactic structures of different
complexity: (the simple verb), (verb + direct object), (verb + direct
object + indirect object), (verb + direct object + indirect object +
prepositional object) and so on. So we get different (complex) lexical entries for give,
give something, give something to someone or give something to someone on
some occasion (cf. Jackendoff 1990). The aim of Jackendoff (and other
one-level-semanticists) obviously is not to build up a genuine lexicon of a given language,
but rather to show the above mentioned quality of natural language as a source
of evidence of human cognition.

The second claim is fulfilled by introducing preference rules for stereo- or
prototype concepts such as BIRD, FRUIT or VEGETABLE. Preference rules
are similar to default values: They mark the best example in a subcategory of a
higher category, as for instance ROBIN as the best example of the category
BIRD.

Preference rules are not only applied to nouns constituting words for natural 
kinds or artefacts, they are also applied to verbal concepts (cf. Jackendoff 1983, 
150): 

(1) I must have looked at that a dozen times, but I never saw it 

(2) I must have seen that a dozen times, but I never noticed it 

(1) and (2) raise a serious problem if we assume that see has a unified single
meaning: The meaning of see in (1) is used to deny its meaning in (2). “x sees₁
y” means something like ’x’s gaze goes to y’. In this sense of see₁ the direct
object may alternate with prepositional phrases:

(3) John saw into the kitchen 

(4) John saw under the chair 

(4) may mean that John’s gaze terminated at a certain point under the chair
or that his gaze passed under the chair to a point beyond. So see₁ is very similar
to a verb of motion:

(5) John saw the flying saucer from his living room 
In (1) it is asserted that I never became aware of the object; so we get see₂
with the meaning ’y comes to x’s visual awareness’. It is precisely this awareness
that is not necessary for the assertion of see in (2). A sentence like

(6) John sees Mary 

is not ambiguous in the same way as We went to the bank: The speaker does not
have one reading or the other in mind. (6) appears to intend both that John’s
gaze went to Mary and that Mary enters John’s awareness, but the presence
of both ’x’s gaze goes to y’ and ’y comes to x’s awareness’ is not a necessary
condition for the use of the verb see; they are stereotypical devices, captured by
preference rules.
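One way to picture a preference rule is as a set of conditions that are individually neither necessary nor jointly required, with the stereotypical case satisfying all of them. The sketch below encodes this reading for see. The function names and the disjunctive treatment (at least one condition must hold) are assumptions made for illustration, not Jackendoff's formalism.

```python
def can_use_see(gaze_reaches: bool, becomes_aware: bool) -> bool:
    """Assumed reading: neither condition is necessary, but one must hold."""
    return gaze_reaches or becomes_aware


def typicality(gaze_reaches: bool, becomes_aware: bool) -> int:
    """Number of preferred conditions satisfied; 2 is the stereotypical case."""
    return int(gaze_reaches) + int(becomes_aware)


# (6) John sees Mary: by default both conditions are taken to hold.
print(can_use_see(True, True), typicality(True, True))

# (2) "seen that a dozen times, but never noticed it": gaze without awareness.
print(can_use_see(True, False), typicality(True, False))
```

Under this reading, example (2) still counts as seeing, only less typically so, which matches the observation that (6) is not ambiguous in the way We went to the bank is.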

The one-level-semantics of Jackendoff and others obviously does not give rise 
to the problem of the distinction between what is a language-dependent concept 
and what is a language-independent concept, a problem two-level-semantics is 
very much concerned with. However, the other problem of how people are capable 
of conveying and understanding the relevant meaning in semantically underde- 
termined utterances is left. I will turn to this problem next. 

4 Meaning, Understanding and Human Communication 

Communication seems to be an extremely ambiguous expression in English. In 
all cases of its occurrence it has obviously something to do with human interpre- 
tation: we may interpret all kinds of images, signs, natural states and behaviour 
of persons as something conveying meaning. This can be illustrated by the fol- 
lowing example (cf. Carston 1999): Imagine observing a scene in which a man 
lowers himself, head and arms first, down into a hole in the ground while another 
man holds onto his legs, swiveling his eyes leftwards in our direction and jerking
his head quite violently from left to right. Very few observers will represent this
scene to themselves as I have described it and leave it at that; most of us will 
try to find some plausible beliefs, desires and/or intentions that we can attribute 
to these two men, some set of mental states which will explain their behaviour. 
We’ll take the head movement of the second man to be, not some involuntary
tic he developed upon seeing us, but rather a movement designed to make it
evident to us that he wants our attention and has something to tell us. We might
even infer what the intended message is, something like ”I want you to help me” 
perhaps. 

The second man’s behaviour, his swiveling his eyes and jerking his head, 
counts for us as an act of communication under the following conditions: 

— the man’s behaviour is intentional, it is an action; 

— the intention is directed towards us 

— the intention contains a message, so that: the man intended us to recognize 
his intention to communicate to us that he wants some help. 

These conditions of human communication have been worked out in detail by 
Grice (Grice 1989). Following him, an act of communication can be defined as
follows: A speaker S communicates something to a hearer H, iff 

(i) S intends H to react in a certain way r

(ii) S intends H to recognize (i) 

(iii) S intends H to react in the way r by means of recognition of (i) 

Communication is thus regarded as a means of influencing people by using signs 
in order to bring about a certain reaction of people presupposing that the recog- 
nition of the intention of bringing about the reaction will be a sufficient reason 
for people to react in the intended way. An important point is that influencing 
people counts only as a case of communication, if it is intended to be recognized. 
In a given situation an imperative ”go downstairs” may not be very different 
from a kick. Both kinds of influencing may bring about the same reaction of an
addressee, that is: to go downstairs, but only the imperative is a case of
communication in the Gricean sense (cf. Keller 1995).
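
The three conditions can be read as a simple predicate. The following is only a minimal sketch of that reading, not part of Grice's or Keller's account: the class `Act`, its field names and the way the kick is encoded (as an act whose intention is not meant to be recognized) are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Act:
    """Hypothetical record of one attempt by a speaker S to influence a hearer H."""
    description: str
    intends_reaction: bool          # (i)   S intends H to react in a certain way r
    intends_recognition: bool       # (ii)  S intends H to recognize (i)
    reaction_via_recognition: bool  # (iii) S intends r to come about by means of that recognition


def is_gricean_communication(act: Act) -> bool:
    """An act counts as communication only if conditions (i)-(iii) all hold."""
    return (act.intends_reaction
            and act.intends_recognition
            and act.reaction_via_recognition)


# Both acts aim at the same reaction (going downstairs); only one is communication.
imperative = Act("says 'go downstairs'", True, True, True)
kick = Act("kicks the addressee downstairs", True, False, False)

print(is_gricean_communication(imperative))  # True
print(is_gricean_communication(kick))        # False
```

The sketch makes the point of the example explicit: the intended reaction alone, condition (i), does not distinguish the imperative from the kick.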

The Gricean conception of communication is not necessarily restricted to lin- 
guistic communication. What are then the special conditions of linguistic com- 
munication? First of all we would say that the content of the act of communi- 
cation, the utterance, has to be understood in the intended way. Understanding 
the speaker’s utterance is necessary for the hearer to react in the intended way. 
If someone says to me: 

(1) Could you pass me the salt

I first have to understand that the speaker uttered a request in order to react in 
the intended way, that is: to pass him the salt. 

The definition of Gricean communication seems to suggest that the hearer 
has the role of a pure recipient of speaker’s intention. But, following Grice, com- 
munication demands not only intentions on the side of the speaker but also 
cooperation on both sides. Communication is a cooperative enterprise governed
by a general cooperative principle and some more specific maxims. The cooperative
principle is about the presupposed appropriateness of a given conversation,
i.e. a case of linguistic communication: “Make your conversational contribution
such as is required, at the stage at which it occurs, by the accepted purpose
or direction of the talk exchange in which you are engaged.” (Grice 1989, 26.)
The specific maxims concern the quantity of information (be informative), the
quality of information (be sincere), the relevance of information (be relevant)
and the modality of information (be clear).

The general cooperative principle and the maxims of communication seem to 
be trivial at first sight. But Grice himself and legions of linguists consider
this principle and the maxims to be a powerful instrument for explaining how 
people convey and understand relevant information. Let’s take a very innocent 
example: Someone saying 

(2) The kettle is black 

thereby means that the outside of the iron kettle is covered with dark brown 
grease stains (example from Travis 1997). The addressee who can see the kettle 
would certainly not say that (2) is false, even if he has evidence that (2), taken
literally, is actually false! In interpreting what the speaker intends him to
understand, namely that the kettle is now darker than it has been some time
before, the hearer must rely on the following assumptions: 

— S and I are in the same situation of communication 

— S has communicated to me that p (the kettle is black) 

— p is not true in the situation we are in 

— S is respecting the general principle of cooperation 

— S is respecting the maxim of quality, i.e. he does not want to tell me some- 
thing false 

— given the situation we are in and the fact that S is cooperative, S intends me
to understand that q (the kettle is now darker than it has been some time
before).
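
The assumptions listed above can be arranged as a small decision procedure. This is a rough sketch under stated assumptions, not a model proposed by Grice: the names `HearerAssumptions` and `intended_meaning` are hypothetical, and the candidate reading q is simply handed to the function rather than derived.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class HearerAssumptions:
    """What the hearer takes for granted when interpreting S's utterance."""
    same_situation: bool      # S and I are in the same situation of communication
    p: str                    # S has communicated to me that p
    p_is_true_here: bool      # whether p is literally true in the situation we are in
    s_is_cooperative: bool    # S respects the general principle of cooperation
    s_respects_quality: bool  # S does not want to tell me something false


def intended_meaning(a: HearerAssumptions, q: str) -> Optional[str]:
    """Return what S is taken to mean, or None if the premises fail."""
    if not (a.same_situation and a.s_is_cooperative and a.s_respects_quality):
        return None          # no basis for working out an implicature
    if a.p_is_true_here:
        return a.p           # the literal reading suffices
    return q                 # p is literally false, S is cooperative: S means q


kettle = HearerAssumptions(
    same_situation=True,
    p="the kettle is black",
    p_is_true_here=False,
    s_is_cooperative=True,
    s_respects_quality=True,
)
print(intended_meaning(kettle, "the kettle is now darker than it has been some time before"))
```

The sketch leaves open the hard part, namely how the hearer arrives at the candidate q in the first place.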

This is the rough mechanism of what Grice has called a conversational implica- 
ture in contrast to implications which are due to the semantics of the expressions 
in a given utterance. It is quite clear that implicatures rely on different kinds of 
information: 

— the information provided by the content of the speaker’s utterance, i.e. what 
is linguistically said; 

— the information provided by the special situation speaker and hearer are in; 

— the information provided by the situation type of communicating. 

These kinds of information have to be completed by a fourth one for cases in 
which the state asserted by the speaker’s utterance is not - as in our example - a 
part of the situation speaker and hearer are in. This kind of information is pro- 
vided by all kinds of background knowledge about how things are or have to be. 
Finally - the most crucial condition - the knowledge constituting all these types 
of information - or at least a part of it - has to be available for both speaker and 
hearer, i.e. it has to be common knowledge. Most of the traditional definitions 
have to struggle with an infinite regress by explaining common knowledge in 
terms of individual knowledge, like: 

A knows X
B knows X
A knows that B knows X
B knows that A knows X
A knows that B knows that A knows X
B knows that A knows that B knows X
and so on, ad infinitum

It is quite obvious that, when people make use of common knowledge, they 
do not pursue an infinite regress of this kind. The analysis in terms of individual
knowledge is not at all appropriate to give a notion of real common knowledge. 
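
The regress can be made concrete with a generator that never terminates on its own. This is a minimal illustrative sketch; the function names are hypothetical and the agents and the fact are plain strings.

```python
from itertools import islice
from typing import Iterator


def nested_knowledge(first: str, second: str, fact: str, depth: int) -> str:
    """Build one statement with `depth` levels of embedding, starting with `first`."""
    knowers = [first if i % 2 == 0 else second for i in range(depth + 1)]
    return " that ".join(f"{k} knows" for k in knowers) + f" {fact}"


def regress(a: str, b: str, fact: str) -> Iterator[str]:
    """Yield the statements of the regress level by level, without end."""
    depth = 0
    while True:
        yield nested_knowledge(a, b, fact, depth)
        yield nested_knowledge(b, a, fact, depth)
        depth += 1


# Only a finite prefix can ever be inspected.
for statement in islice(regress("A", "B", "X"), 6):
    print(statement)
```

Any finite prefix of this sequence falls short of common knowledge, which is why an analysis in terms of individual knowledge never arrives at it.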



Devlin (1997) has proposed an explanation in terms of situation theory: The
act of two persons A and B having common knowledge of a fact, event or cir- 
cumstance X constitutes a situation s. The situation s is a common knowledge 
situation for A and B to have common knowledge of X, iff: 

(i) s is a situation supporting the individual knowledge of A and B: 
$s \models \langle\langle A \text{ knows } X \rangle\rangle \wedge \langle\langle B \text{ knows } X \rangle\rangle$

(ii) both A and B know that s, the situation they are in, is one of common 
knowledge: 

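$s \models \langle\langle A \text{ knows } X \rangle\rangle \wedge \langle\langle B \text{ knows } X \rangle\rangle \wedge \langle\langle A \text{ knows } p \rangle\rangle \wedge \langle\langle B \text{ knows } p \rangle\rangle$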

where p is the very proposition about A’s and B’s knowing p (cf. Devlin 1997,
255). The analysis is circular and self-referential. Its advantage is the
guarantee, in principle, that everything in the common knowledge situation available
to A is to the same extent available to B and vice versa. Consequently, with
regard to the Gricean framework, the speaker is not more privileged in meaning
something than his addressee is in understanding what the speaker meant.
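
A circular definition of this kind has a finite representation, in contrast to the regress. The sketch below is an illustration only and not Devlin's formalism: `Situation`, `Proposition` and the tuple encoding of what a situation supports are assumptions made for this example.

```python
class Situation:
    """A situation, reduced here to the set of items it supports."""

    def __init__(self) -> None:
        self.infons = set()

    def supports(self, infon) -> bool:
        return infon in self.infons


class Proposition:
    """The proposition that a situation supports its items; it refers back to that situation."""

    def __init__(self, situation: Situation) -> None:
        self.situation = situation


def common_knowledge_situation(a: str, b: str, fact: str):
    """Build s and p so that s supports: a knows fact, b knows fact, a knows p, b knows p."""
    s = Situation()
    p = Proposition(s)  # p is about s, and s will contain p: the circle closes here
    for agent in (a, b):
        s.infons.add(("knows", agent, fact))
        s.infons.add(("knows", agent, p))
    return s, p


def known_by(s: Situation, agent: str) -> set:
    """Everything the situation makes available to one agent."""
    return {content for (_, who, content) in s.infons if who == agent}


s, p = common_knowledge_situation("A", "B", "X")
print(len(s.infons))                        # 4
print(known_by(s, "A") == known_by(s, "B"))  # True
```

Four items suffice, and what is available to A coincides with what is available to B, which is the symmetry the circular analysis is meant to secure.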

My purpose in this part of the paper is to give an account of how
underdetermined utterances or parts of them can be interpreted by a hearer
in accordance with the speaker’s meaning. So far, we have only considered the
case of the black kettle, a case where the utterance had to be re-interpreted.
What about the open-, office- or institute-cases mentioned in parts 1 and 2
of my paper? In these cases the hearer has to determine which of all
possible concepts is the one meant by the speaker. Does he really have to work
out the speaker’s meaning by a Gricean implicature, as seems plausible for
the black kettle case (as well as for other cases like metaphorical uses, rhetorical
questions, tautologies and irony)? Certainly, if we assumed that we have stored all the
possible concepts for open in our mental lexicon and that we have clear devices
for accessing them, the Gricean mechanism would be unnecessary. But this is
not the case: the possible concepts for open (and legions of other words like cut,
write, fly, eat, begin, end or compute) are indefinite, and we don’t have singular
situations in mind as pointers to linguistic expressions: what we have stored in
long term memory are situation types related to types of words, if any.

What special kind of implicature would be efficient to determine the concep- 
tual content of open in a given context and situation? A speaker utters 

(3) The plumber opened the washing machine yesterday, but he couldn’t find 
any defect 

meaning that the plumber unscrewed the machine. Are the conceptual compo- 
nents of the utterance sufficient for the hearer to understand that the plumber 
unscrewed the machine? Surely not, the plumber could just have opened the lid 
of the machine and looked into it or he could have done both, opened the lid 
and unscrewed the machine. So, the hearer has the choice between ’unscrewing
the machine’, ’opening the lid’ and ’opening the lid and unscrewing the machine’.
Is it our knowledge about the world that determines meaning in context?
If so, what kind of real, normal or expected world should this be? I think there
is no way out of relying only on linguistic context and individual background
knowledge. We first have to account for the hearer’s presumption of the speaker’s 
cooperative acting, especially his being relevant. Assuming that the speaker in- 
tends to communicate relevant information, the hearer’s background knowledge 
presupposed by him to be the speaker’s and the hearer’s common knowledge is 
activated, and this allows him to recognize the speaker’s intended meaning, that
the plumber has unscrewed the washing machine. 

Though this is a very weak implicature, it clearly illustrates that even in
rather unspectacular cases of interpretation, pragmatic aspects related to acting
play a fundamental role. The lexicon of a language, conceptual knowledge
and communicative acting are inseparably interrelated within human cognitive
systems, unlike artificial ones.

Abstract. Conceptual authoring support provides tools to help authors
construct and organize their document on the conceptual level. As com- 
puter-based tools are purely formal entities, they cannot handle natural 
language itself. Instead, they provide the author with directions and ex- 
amples that (if adopted) remain linked to the text. This paper discusses 
several levels of such directions: A Pattern describes a solution for a com- 
mon problem, here a combination of audience and topic. It may point 
to several Schemata, which may be expanded in the document structure 
graph, leaving the author with more specific graph structures to expand 
and text gaps to fill in. A Type Definition is finally a restriction on the 
possible document structures the author is allowed to build. Several ex- 
amples of such patterns, schemata and types are presented. 

These levels of support are being implemented in an authoring support 
environment called Chasid. It extends conventional authoring applica- 
tions, currently ToolBook. The graph transformation aspects are imple- 
mented as an executable PROGRES specification. 



1 What Conceptual Authoring Support Is Supposed 
to Be Good for 

Most readers of this text will have been confronted with the task of writing 
larger complex texts. A Master’s Thesis, as a once-in-a-lifetime task, is accepted 
as consuming a lot of time. However, when articles, proposals, reports or even 
books must be written on tight schedules, more efficient techniques have to be 
adopted. The reader, on the other hand, has come to expect more from the 
structure of a text. It is annoying to page through an electronic PostScript or 
even PDF document to trace cross-references. Ideally, the document should not 
only have one linear reading path, but provide just the necessary information 
needed to satisfy the reader’s interests. 

These observations are not fundamentally new. However, solutions to these 
problems are rare. Writing courses help in formulating ideas and in maintaining 
an overview of the text as a whole. There is also advice on reading, addressing
the extraction of the crucial context (1), or simply increasing the speed of the 
sentence recognition process. As in other application areas, usage of automated 
help is a viable alternative to education: not everybody who needs the skills has 
the time to attain them. A less fatalistic argument is that the computer is better 
at avoiding the stupid mistakes humans tend to make when having too much on 
their mind and too little time. 

Computers have proven useful in capturing and typesetting text. However, 
the support currently available in practice and research remains very basic. Word 
processors and text editors allow manipulation of characters, maintaining a uni- 
fied layout and up-to-date cross-references. This does not address the horror 
vacui writers traditionally face: the empty sheet has been replaced by an empty 
screen. To alleviate this, we propose a system of patterns that map problems 
to solution strategies and schemata that provide a structure of topics to be 
addressed and their relationships. 

The structure of this article follows the Problem-Solution Presentation
schema shown in Figure 7. The “Problems” section (2) looks closer at the
writing and reading problems and identifies the ones most interesting here. The 
“Solution” is spread across three sections: the first (Section 3) describes the basis 
of the solution, namely authoring using conceptual graph structures. Concrete 
functionality is presented in Section 4, with the implementation details deferred 
to Section 5. The concepts presented here are related to other work in the oblig- 
atory “Related Work” section (6). Finally, conclusions and plans (Section 7) 
recapitulate the main thesis and identify the problems we intend to address 
next. Throughout the discussion, this paper itself is used as a running example. 



2 The Problems: Writing and Reading 

The author has to solve several problems before writing can begin: First, a
topic must be chosen. The author must understand that topic sufficiently, and
she must have references or evidence to support the claims made in the context 
of that topic. Even though these problems vary with the kind of document to 
be written, they constitute the basis on which authoring can take place (14). To 
a certain degree, these problems can still be solved iteratively during writing. 
But, as a prerequisite for writing, they are not addressed here. 

When setting out to write, some external requirements are usually also de- 
fined. The size of the document may or may not be restricted, and the audience 
(in terms of knowledge and interests) should also be discernible. The author’s 
basic problem is then to convey a topic in a given amount of space, building 
on some required knowledge and serving some interest. 

The author has to select the points that must be made from everything that 
can be said about the topic. She must devise a structure of the document and 
fill the structure with text and other media. 

After a certain amount of document has been written, the editing loop usually 
begins. The author reads and evaluates her text, adding, deleting or moving portions.
In this process, she must ensure consistency and reasonable freedom from
redundancy. This loop may also include internal or external review.

Finally, the document has to conform to formal quality standards of orthography,
grammar, typography, layout, and reproduction. If the document is to
be published in some way, the editor and publisher invest time to technically
unify documents, especially submissions to a conference or books of a series. The
publisher extracts bibliographical information for Library of Congress archiving
(or other national equivalents). Finally, the publication is prepared for mass
printing and distribution.

When the document arrives at the reader, he is usually interested only in 
parts of it. These parts may be the central claims, some hard data, instructions 
for certain procedures or solutions to more-or-less concrete problems. So, the 
reader’s basic problem is to find the relevant sections of the text quickly and 
grasp their meaning. This process may iterate as further reading turns out to be 
necessary. 

As the author could not anticipate or address the specific reading interest, 
the reader has to make do with tables of contents, indices and headings. If the 
reader seeks this author’s usage of some terms or methods, however, this
guidance is of little use. Since construction of indices is costly and uses printing 
space, articles usually come without any and books only with a selected few. 

Our work concentrates on supporting the author so that conceptual informa- 
tion that is useful for publishers and readers is gathered as a side-effect. In this 
paper, we address the authoring problems of selecting the points to be made and 
structuring the text. 



3 The Basis: Authoring Knowledge in Conceptual 
Structures 

The proposed solution starts by capturing authoring knowledge in patterns, 
schemata and types that can be used in high-level (concept-level) authoring 
tools. This entails adding the conceptual structure explicitly to the document. 

The enhanced document contains the presentation (the core conventional 
document), the presentation structure (a description of its immediate struc- 
ture, extending the expressiveness of the conventional authoring application) and 
the content model (a model of the concepts described in the document and 
their relationships). Initially, there is an empty presentation, the presentation 
structure consists of one generic top-level structure concept [Presentation] 
and the content model of one generic concept [Content]. 
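
The three components and their initial state can be pictured as a small data structure. This is a sketch for illustration only and does not reflect the actual Chasid data model: the class and field names are assumptions.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Concept:
    """A concept node, written [Type] in the text; `parts` holds later expansions."""
    type_label: str
    parts: List["Concept"] = field(default_factory=list)

    def __str__(self) -> str:
        return f"[{self.type_label}]"


@dataclass
class EnhancedDocument:
    # The core conventional document; empty at the start.
    presentation: str = ""
    # A description of the document's immediate structure.
    presentation_structure: Concept = field(default_factory=lambda: Concept("Presentation"))
    # A model of the concepts described in the document and their relationships.
    content_model: Concept = field(default_factory=lambda: Concept("Content"))


doc = EnhancedDocument()
print(doc.presentation_structure)  # [Presentation]
print(doc.content_model)           # [Content]
```

Authoring then consists of growing the two generic concepts and attaching text to them.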



3.1 Patterns Are Top-Level Guidance 

Patterns have been characterized in (2): “Each pattern describes a problem 
which occurs over and over again in our environment, and then describes the 
core of the solution to that problem, in such a way that you can use this solution 
a million times over, without ever doing it the same way twice.”¹ Apart from
the descriptions of problem and solution, a pattern consists of a unique name 
and a discussion of consequences of its use. 

In our context, we use patterns to help the author reach clarity on the au- 
dience and topic. By browsing through a catalog of patterns, she is invited to 
consider alternatives for creating the document, based on her understanding of 
the audience and the topic. These are the keys for matching a problem, where 
the solution is the value mapped to the keys. Once the tool comes with a larger 
body of patterns, they may be categorized so the author can consult the catalog 
in a dialog-like manner.
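
The key/value reading of a pattern catalog can be sketched as a lookup table. The following is an assumption-laden illustration, not the Chasid catalog: the key labels for audience and topic are invented here, and the pattern texts are abbreviated from Patterns 1 and 2.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class Pattern:
    """A pattern: unique name, problem, solution, and a discussion of consequences."""
    name: str
    problem: str
    solution: str
    consequences: str


# Keys are (audience, topic) pairs; the labels are hypothetical.
Key = Tuple[str, str]

CATALOG: Dict[Key, Pattern] = {
    ("shares the goal", "advance towards the goal"): Pattern(
        name="Goal Approximation",
        problem="Describe advances towards reaching a goal accepted by the audience.",
        solution="State the goal briefly; focus on the open question being answered.",
        consequences="The goal needs little discussion; demands on the solution are higher.",
    ),
    ("other research field", "development not on their mind"): Pattern(
        name="Relationship Contribution",
        problem="Describe a development that is of interest to the audience, but not on their mind.",
        solution="Propose the solution for a problem immediately appealing to the audience.",
        consequences="Risk of rejection as unrelated; the problem must be made interesting.",
    ),
}


def match(audience: str, topic: str) -> Optional[Pattern]:
    """Return the pattern stored under the given keys, or None if there is none."""
    return CATALOG.get((audience, topic))


found = match("other research field", "development not on their mind")
print(found.name if found else "no pattern")  # Relationship Contribution
```

An exact-match dictionary is the simplest possible choice; a categorized, dialog-like catalog would need a richer matching scheme.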

As there is surprisingly little advice on this level in the literature, the patterns 
presented here are tentative proposals. There are also hardly any usage examples, 
as required by (5). This is a violation of the principle that patterns are supposed
to express proven solutions, for which the only remedy is further study. 

Goal Approximation (Pattern 1) is the ideal research community setting. 
There is a combined effort of research groups communicating with publications 
to achieve a common goal. As a consequence, the goal can just be referred to 
and does not have to be discussed in detail. On the other hand, requirements on 
the solution itself are higher. Relationship Contribution (Pattern 2) tries to
initiate a dialog between research fields. This is a consequence of the atomiza- 
tion of research activity based on presumptions determining potential tools and 
methodologies. Papers of this kind run a risk of rejection on the grounds of be- 
ing not related and must therefore pay special attention to making the problem 
interesting. 

Pattern 1. Goal Approximation 

Problem Describe advances towards reaching a goal accepted by the audience.

Solution In a Problem-Solution schema, state the goal briefly, referring to
the literature for further discussion. Focus on the open question you are
about to answer (Open-Problem schema). Describe your findings (empirical
data, technical implementation, proof) in appropriate patterns (Data-Solution,
Technical-Solution, Proof-Solution). Connect these findings
to the open problem, restating it appropriately (Partial-Answer or
Full-Answer). If applicable, discuss other consequences of your findings,
such as ruling out other overall answer attempts.

Discussion The aim of the publication is to be a noticeable step forward. As
a prerequisite, the result must directly link to a question defined in the
literature in two ways: it must provide [part of] an answer to the question,
and the question must still be open in that respect. Once this is established,
there is no further need to discuss the problem. A reformulation of the question
may be acceptable, but must not alter its foundation. If the problem is
sufficiently open, giving a speculative answer that may, through discussion,
lead to a narrowing of the question, can be accepted.



¹ p. X, cited after (5).
Pattern 2. Relationship Contribution

Problem Describe a development that is of interest to the audience, but not
on their mind.

Solution In a general Problem-Solution schema, propose your solution as
one for a problem immediately appealing to the audience (Known-Problem
or, weaker, Topical-Problem). Emphasize the aspects of the discussion
where the relationship surfaces. Method-Solution is appropriate for
the description of the solution, as it will probably emphasize higher-level
similarities. Differences may still be handled in a Technical-Solution
schema. Use the terminology of the audience primarily, citing their classical
works. Cite your own classical works for further reading.

Discussion The critical part is to make the contribution interesting enough for
the audience to accept the different roots. One source of interest is work
from the audience that states similar problems. As a last resort, topoi may
be employed.² Put an emphasis on the application area and its interesting
problems, not on your understanding of the target audience’s roots. As the
audience probably does not speak the same language, their relevant terminology
from classical works or textbooks should be used.



Another pattern is hinted at here to illustrate the application of the paradigm to
non-submission documents. The Status Report (Pattern 3) may be customized
for various authorities and have a tight interplay with schemata and type
definitions.

Pattern 3. Status Report 

Problem Report to an authority the progress and status of a project. 
Solution For short-term reports, omit the description of goals and previously 
submitted reports. Factually describe the events in the report scope, relating 
them to goals (Partial-Answer).

Discussion . . . 



This paper adopts the Relationship Contribution pattern. As can be seen 
in the implementation section (5), its background lies in somewhat different im- 
plementation technologies, but its content can well be expressed with conceptual 
structures terminology on the basis of the common paradigm of graphs (rather 
than trees or strings). 

² In Aristotelian rhetoric, a topos is a sentence commonly believed, regardless of its
truth. In the Topica, Aristotle gives rules for such topoi, for example: “Also,
the same things are more worthy of choice when pleasure is added than when it is
absent, and when accompanied by freedom from pain than when attended by pain.”
(?, III, 2, a23ff/p. 393).
schema PROBLEM-SOLUTION for CONTENT(x) is




Fig. 1. Schema Problem-Solution 

The Problem-Solution proposal references a problem, which must be of relevance to 
the audience (the target of the proposal), that is: they must have experienced the event 
as a problem. The author proposes a solution that solves the problem. Own literature 
may be cited to out-source detailed discussions or to add weight to the solution. Other 
literature is used to ground the problem and parts of the solution. 



3.2 Schemata Provide a Guide to Requirements and Structure 

Guided by the pattern, the author expands the starting concept [Content] 
with some schemata to build the model of the content to be described in the 
document. Concepts in the model may be related to concepts in the structure 
and finally to parts of the presentation. 

To support the author, we extend the definition of a schema given in (12), 
Definition 4.1.1: 

A schematic cluster for a type t is a set of monadic abstractions
$\{\lambda a_1 u_1, \ldots, \lambda a_n u_n\}$, where each formal parameter $u_i$ is of type t. Each
abstraction $\lambda a_i u_i$ in the set is called a schema for the type t.

An abstraction, consisting of a conceptual graph in the body and the parame- 
ters, provides little guidance on its use. So, our schema consists of not only the 
abstraction, but also of documentation describing its usage and a name. The 
documentation may elaborate on the specific role a node plays and emphasize 
key components. 
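
The extended notion of a schema, an abstraction plus a name and documentation, can be sketched as follows. This is an illustration under assumptions, not the Chasid implementation: graphs are reduced to sets of (concept, relation, concept) triples, and the relation names are invented, since the graph of Figure 1 is not reproduced here.

```python
from dataclasses import dataclass
from typing import FrozenSet, Set, Tuple

Triple = Tuple[str, str, str]  # (concept, relation, concept)


@dataclass(frozen=True)
class Schema:
    name: str                # unique name of the schema
    for_type: str            # the type t the schema applies to
    body: FrozenSet[Triple]  # the conceptual graph of the abstraction
    documentation: str       # usage notes: roles of nodes, key components


def expand(graph: Set[Triple], node_type: str, schema: Schema) -> Set[Triple]:
    """Expand a node of the given type by joining the schema body to the graph."""
    if schema.for_type != node_type:
        raise ValueError(f"{schema.name} is a schema for {schema.for_type}, not {node_type}")
    return graph | schema.body


problem_solution = Schema(
    name="Problem-Solution",
    for_type="Content",
    body=frozenset({
        ("Author", "proposes", "Solution"),     # hypothetical relation names
        ("Solution", "solves", "Problem"),
        ("Problem", "relevant-to", "Audience"),
    }),
    documentation="The problem must be of relevance to the audience of the proposal.",
)

content_model = expand(set(), "Content", problem_solution)
print(len(content_model))  # 3
```

Because a schema only offers guidance, the author remains free to alter the expanded graph afterwards; nothing in the sketch forbids that.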



Content Schemata Help Organize the Content Model. We begin the
discussion of sample schemata with an overall Problem-Solution Schema.
Schema 1 shows the base schema of a paper centered on a solution to a problem.³
The problem must be related to the audience of the discussion. Depending on 

³ In the figures, many relations are actions rather than function words as proposed
in (12). Technically, this is justified by Assumption 3.6.12 of (12), which introduced 
relational definitions. This leads to more compact and more human-readable graphs 
here. 







Author Support through Formalized Experience 






type PROBLEM(x) is

[Figure: conceptual graph linking the concepts SITUATION:*x and NEGATIVE]

Fig. 2. Type Problem

A generic problem is a situation perceived as negative. The situation may be the
openness of a question, the high cost of some procedure, etc.



schema ACCEPTED-PROBLEM for PROBLEM(x) is 




Fig. 3. Schema Accepted Problem 

An accepted problem is described in the literature known to the audience. If there is
active research activity, the Open Problem schema is more appropriate. 



its further expansions, the problem may be related to literature known to the 
audience, or the author may ground the problem in common experience. The 
author of the document proposes a solution which solves the problem. 

A problem is characterized by being a situation which is valued as negative 
(Type 2). This definition is of little use for the author, though. In the discussion, 
a problem may be structured as a known problem (it is described in some litera- 
ture, Schema 3), or as an open problem (there are several proposed answers and 
some body of literature, Schema 4). Alternatively, a problem may be grounded
on topoi commonly believed by the audience (Schema 5). 

Solutions are the other side of a problem. Their type is omitted here. A 
solution may base its value on being implemented and used (Schema 6). An 
impressive example of this is found in (16), mentioning “120 documents of major 
proportions had been produced by the method” (p. 68/A-15). A conceptual or 



schema OPEN-PROBLEM for PROBLEM(x) is

[Figure: conceptual graph with the labels PROBLEM:*x, TOPIC, DESCRIPTION, GRND and LITERATURE]




Fig. 4. Schema Open Problem 

An open problem is the object of research activity, so there is literature on the overall 
goal and interim results. There may also be several not fully established answers. 
schema TOPICAL-PROBLEM for PROBLEM(x) is





Fig. 5. Schema Topical Problem 

The topical problem rests on the assumption that a topos is accepted in the audience. 
It needs no further justification of its existence. The relevance must still be assured. 



schema TECHNICAL-SOLUTION for SOLUTION(x) is

[Figure: conceptual graph with the labels SOLUTION:*x, IMPLEMENTATION, APPL and USAGE:{*}]



Fig. 6. Schema Technical Solution 

The solution is available in some implementation, which may have been applied in a 
number of usages. 



methodological solution may ground itself on its beauty or novelty. Note that
both schemata may be applied, so a solution may be somewhat implemented, 
but appeal mainly by its novelty. 

In this paper, the introduction begins with stating the topical problem of 
writing complex documents. By citing authoring literature such as (14) and 
closer exposition, it states the author’s and reader’s problems as known problems. 

Presentation Schemata Help Organize the Presentation. Depending on 
the space available and her judgment, the author expands a number of pre- 
sentation schemata, populating the presentation with a structure of the usual 
hierarchy of sections and subsections. Specific concepts in the presentation struc- 
ture may discuss concepts of the content, or one concept may subsume a number 
of others. By ordering and stating a certain depth, the schema guides the author 
in formulating a coherent text. 

In the case of this paper, a schema that discusses problems first and presents a 
solution later (Figure 7) has been used. This top-level schema is just an overview, 
as the finer points of interconnection are provided by lower-level schemata for 
the presentation of, for example, problems or related work. 

Alternative presentation schemata present a vision (there is no implementa- 
tion, even the problem may be low-key) or report on some findings (fits together 
with Open Problem Schema (Schema 4)) or give an account of how time was 
spent on a project (a natural choice for Status Report patterns). The IMRD 
format of (14) or van Dijk’s superstructures (17) are other candidates for pre- 
sentation schemata. 



Type Definitions Enforce Constraints. While schemata may be violated, 
a type definition may not. A type definition for a submission to a conference 
would usually require at least one author and a title for the document. As a 
schema PROBLEM-SOLUTION-PRESENTATION for PRESENTATION(x) is




Fig. 7. Schema Problem-Solution Presentation 

In the abstract, prerequisites (presumed facts or topoi) are skimmed, followed by a
discernible description of the problem. The idea and main aspect of the solution are
also mentioned. The introduction elaborates on these aspects to ease the reader into the
realm of the paper. The problems section describes a number of problems and selects
the ones that are about to be solved. The discussion of the solution can be layered or
segmented in parts. In the related works section, the other work should address roughly
the same problems, but provide no complete solution.



means of authoring support, the type serves to enforce basic requirements and 
reduces copy editing cost at the publisher. Type constraints may also be used 
on lower levels, such as prohibiting dangling cross-references. 
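
The difference between advisory schemata and binding type definitions can be illustrated with a small checker. This is a hedged sketch, not the PROGRES specification used in Chasid: the `Submission` class, its fields and the wording of the messages are assumptions.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class Submission:
    """Hypothetical document record for a conference submission."""
    title: str
    authors: List[str]
    labels: Set[str] = field(default_factory=set)      # targets defined in the document
    references: Set[str] = field(default_factory=set)  # targets that cross-references point to


def violations(doc: Submission) -> List[str]:
    """Return all type violations; an empty list means the document conforms."""
    problems: List[str] = []
    if not doc.title.strip():
        problems.append("missing title")
    if not doc.authors:
        problems.append("at least one author is required")
    # A cross-reference is dangling if its target is not defined anywhere.
    for target in sorted(doc.references - doc.labels):
        problems.append(f"dangling cross-reference: {target}")
    return problems


draft = Submission(
    title="Author Support through Formalized Experience",
    authors=[],
    labels={"fig:schema", "sec:problems"},
    references={"fig:schema", "sec:solution"},
)
print(violations(draft))
```

For this draft the checker reports the missing author and the reference to the undefined target `sec:solution`; a tool enforcing the type would refuse to accept the document until both are resolved.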

4 High-Level Functionality of the CAS System Chasid 

We are applying these plans in the prototype high-level authoring support sys- 
tem Chasid (Consistent High-level Authoring and Studying with Integrated 
Documents). It extends conventional documents and authoring environments 
with graph structures that capture the content and the presentation structure. 
On this basis, schemata can be expanded, and their fulfillment can be monitored 
and supported. 

From the perspective of a user who is used to the conventional application,
our implementation is called the “conceptual extension” or “semantics
extension”.

The author may interactively expand schemata or types to edit the content or 
presentation structure. She may also associate presentation nodes with portions 
of the conventional document. These operations are immediately visualized in 
a graphical display. Figure 8 shows a screenshot of the running prototype, with 
open pattern and schema browsers and a cutout of the conceptual graph of this 
paper. Connections to the hierarchical presentation structure or the external 
document are not displayed. Some operations perform arbitrary type-conforming 
operations on the graph. These operations are not necessarily canonical, but they 
do not violate the type expansion rules. 

Fig. 8. Screenshot of the Chasid prototype. 

The graph structures may be scanned for problematic patterns, which are 
then flagged with warning nodes. The warnings may be dismissed at the instance 
level by the author; for some warnings, there are also repair operations. For 
example, hierarchy anomalies such as nodes with only one child are flagged. 
If the content model contains a group of concepts and that group is linked to 
a hierarchical concept, it is ensured that the latter contains the corresponding 
sub-concepts. 
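The single-child anomaly mentioned above can be sketched as a scan over the presentation hierarchy. This is an illustration under assumptions, not the Chasid implementation: the hierarchy is modelled as a plain dictionary, and the function name and the `dismissed` parameter (standing for warnings the author has dismissed per instance) are hypothetical.

```python
def single_child_warnings(children, dismissed=frozenset()):
    """Return the nodes that have exactly one child and were not dismissed.

    children:  mapping from a node to the ordered list of its child nodes
    dismissed: nodes for which the author has dismissed this warning
    """
    return [node for node, kids in children.items()
            if len(kids) == 1 and node not in dismissed]


# Hypothetical presentation hierarchy.
hierarchy = {
    "Paper": ["Introduction", "Approach", "Conclusion"],
    "Approach": ["Schemata"],      # anomaly: a lone subsection
    "Introduction": [],
}
print(single_child_warnings(hierarchy))
```

Keeping the dismissed set outside the scan means that a repeated scan does not flag again what the author has already accepted.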

Manipulations of the graph structures are propagated to the conventional 
document. Correspondingly, edits in the conventional document are propagated 
into the graph structures, if they are structurally relevant. 

The author may alternatively use the system bottom-up instead of top-down, 
as described previously: She may build the document in the conventional applica- 
tion and the graph-based extension keeps track of the structure as best as it can. 
This gives the author a graphical view of the presentation structure of the docu- 
ment. She may manipulate this, causing changes in the conventional document. 
From the structure, the content graph may be built by attaching imported and 
exported concepts to the presentation concepts. Between these content model 
concepts, relations such as prerequisite-for may be defined to trigger warnings 
should a prerequisite be described only after that which needs it. We discuss 
these features in more depth in (6). 
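The prerequisite-for check can be sketched as follows. The data layout is an assumption made for this illustration: the document is reduced to the order in which concepts are first described, and the relation is given as (prerequisite, dependent) pairs. The function name is hypothetical.

```python
def prerequisite_warnings(order, prerequisite_for):
    """Return the (prerequisite, dependent) pairs that appear in the wrong order.

    order:            concepts in the order in which the document describes them
    prerequisite_for: iterable of (prerequisite, dependent) pairs
    """
    position = {concept: index for index, concept in enumerate(order)}
    return [(pre, dep) for pre, dep in prerequisite_for
            if pre in position and dep in position
            and position[pre] > position[dep]]


# "pattern" is needed to understand "schema" but is described after it.
order = ["schema", "pattern", "type definition"]
relations = [("pattern", "schema"), ("schema", "type definition")]
print(prerequisite_warnings(order, relations))
```

Concepts that are not yet described anywhere are ignored here; a fuller check might flag them as missing instead.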

5 Implementation Structure of Chasid 

The system is being implemented using the high-level graph transformation 
language PROGRES.^4 It is based on declaring a graph schema (closer to database 
schemata than conceptual schemata) and productions, tests, path expressions, 
transactions and consistency constraints on them. The evaluation can proceed 
deterministically in imperative command sequences, or may use nondeterminism 
and backtracking to find possible solutions. 

^4 (8; 10), http://www-i3.informatik.rwth-aachen.de/research/progres. 

safe production CMCGSProblemSolution_Proposal ( theContent : CMCG_CONTENT [1:1]) [1:1] = 
(graphical production body; surviving labels: `1 = theContent, 6' : CMCGKnow, -CMFrom-) 

Fig. 9. PROGRES production to instantiate a Problem-Solution schema 

The most important language features are graph rewriting operations 
(productions), where arbitrary graph patterns may be partially replaced with 
new graphs. As an example, Figure 9 shows a graph production that inserts the 
schema Problem-Solution (Schema 1) into the current working graph. To edit 
PROGRES specifications, a language-sensitive editor and interpreter are 
available. 
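To convey what a production does, the following sketch mimics one in Python rather than in PROGRES. Everything in it is an assumption made for illustration: the `Graph` class, the node types CONTENT, PROBLEM and SOLUTION, and the edge labels are hypothetical and do not reproduce the production of Fig. 9. It only shows the general shape of a rewrite: match a left-hand side, then add the right-hand side.

```python
from dataclasses import dataclass, field


@dataclass
class Graph:
    nodes: dict[int, str] = field(default_factory=dict)             # id -> node type
    edges: set[tuple[int, str, int]] = field(default_factory=set)   # (source, label, target)

    def add_node(self, node_type: str) -> int:
        node_id = len(self.nodes) + 1
        self.nodes[node_id] = node_type
        return node_id


def insert_problem_solution(graph: Graph, content: int) -> bool:
    """Toy production: match one CONTENT node and attach a schema instance to it.

    Returns False and leaves the graph untouched when the left-hand side
    (a node of type CONTENT) does not match.
    """
    if graph.nodes.get(content) != "CONTENT":
        return False
    problem = graph.add_node("PROBLEM")      # hypothetical node types
    solution = graph.add_node("SOLUTION")
    graph.edges |= {(content, "contains", problem),
                    (content, "contains", solution),
                    (solution, "solves", problem)}
    return True


g = Graph()
root = g.add_node("CONTENT")
applied = insert_problem_solution(g, root)
```

A failed match leaves the graph unchanged, which is what allows a caller to try alternative productions or to backtrack.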

From a PROGRES specification, a prototype application can be generated 
where the productions and transactions of the specification can be executed in- 
teractively. This prototype is connected to the conventional application through 
various a-posteriori integration techniques. The integration code detects changes 
on a very low level and translates them into transactions on a portion of the 
graph that closely mirrors the structure of the conventional document. Consis- 
tency constraints trigger transactions that propagate these changes to changes 
in the “higher” parts of the graph. 

Propagating changes back is implemented using database triggers. Each change 
in the graph causes a change in the graph database where it is stored. This in 
turn triggers events. These events are monitored by the integration code and 
translated into editing operations of the conventional application. 
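The trigger mechanism can be pictured as a simple listener chain. The sketch below is an assumption-laden illustration: `GraphStore` stands in for the graph database, and the textual format of the editing operation is made up; neither reflects the actual integration code.

```python
class GraphStore:
    """Hypothetical stand-in for the graph database; fires an event on every change."""

    def __init__(self):
        self._attributes = {}
        self._listeners = []

    def on_change(self, listener):
        self._listeners.append(listener)

    def set_attribute(self, node, name, value):
        self._attributes[(node, name)] = value
        for listener in self._listeners:     # plays the role of the trigger
            listener(node, name, value)


editor_operations = []   # operations that would be sent to the conventional application


def integration_code(node, name, value):
    """Translate a graph event into an editing operation (format is made up)."""
    editor_operations.append(f"set {name} of {node} to {value!r}")


store = GraphStore()
store.on_change(integration_code)
store.set_attribute("section-4", "title", "Functionality")
```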

The current implementation integrates with ToolBook. This application has 
been chosen because it is built on a relatively open message-passing structure. 
This choice also reflects the roots of this project in the multimedia context. 

The prototype consists of a framework written in Java (7), the C code gener- 
ated from the specification and the graph database, which is currently a legacy 
system written in Modula-3 and C. The database restricts the prototype to the 
Solaris and Linux platforms. The integration code uses Delphi and the Tool- 
Book OpenScript languages. It is integrated into ToolBook instructor II 6.1, 
which runs on MS Windows. The communication runs over a proprietary TCP 
protocol. The implementation is still far from complete. 

6 Related Work 

The work described here touches a number of research areas that provide a 
contribution, but not a solution to the problems outlined above. 

Linguistics and its popular branches are a source for various kinds of schemata 
and patterns. T. A. van Dijk’s research on superstructures (17) revealed some 
presentation structures, but did not rate them in terms of readability. There is 
a wealth of material on the construction and formulation of arguments, with 
Toulmin (15) among the best-known authors. These treatises provide 
schemata on more fine-grained levels than described here. 

One area where active authoring support of the kind presented here is given 
is that of pragmatic text schema books. As early as the 16th century, books were 
published containing prefabricated letters for specific communication purposes 
(3). Today, such collections have been translated into electronic form and have 
become somewhat more flexible. Still, they rely heavily on the specific 
communication purpose, which may only be parameterized to a certain extent. There 
are also only a few checks for consistency. 

A different approach to the problem of structuring a document was adopted 
by the STOP team (16). They introduced “thematic quantization”, forcing writ- 
ing managers and writers to plan their documents so each two-page spread han- 
dles one theme. By enforcing early structuring, the task of the single writer 
becomes more productive, and editing loops are factored out. There are, 
however, few hints on structure beyond the two-page spread. 

The “Writer Environment” family of computer-supported writing projects 
maintains a separation of the document into content space, rhetorical space, 
planning space and argumentation space (11; 13). The same concepts are used 
here with the content model (corresponding to the content space), presentation 
structure (argumentation space) and conventional document (rhetorical space). 
There is no separate organizing planning space as it does not directly contribute 
to solving writers’ organization problems. The development has produced highly 
flexible teamwork hypermedia authoring environments. From our perspective, 
they have two drawbacks: they do not integrate tightly with existing applications, 
and there is little support in building the structure. 

The idea of imports and exports of text blocks can be used to tailor a lesson 
for a specific reader from a large body of knowledge. This has been done by 
Nejdl (9). This system does not address the authoring support problems. Also, 
coherence may be hard to achieve when the informational units are rearranged. 



7 Conclusion: Thesis, Usage, Plans 

In this paper, I have presented an application of conceptual structures in 
supporting authors to construct coherent and lucid documents more efficiently. The 
application uses patterns as problem clarification guidance, various schemata 
as content and presentation structure help, and type definitions for enforcing 
external requirements. 

As the system is still in its implementation stage, it has not yet been applied 
to any text production. The concepts, however, are used with pen and paper in 
the preparation of this document and some earlier publications. This has shown 
that a lasting integration of the graphs with the document is crucial: updating 
the paper graph has always been discontinued at some time during editing. 

This state defines the further plans: Implement the integration and specifica- 
tion so that the system may be applied in real-life projects. We are also looking 
at some existing texts to determine more well-grounded patterns and schemata. 
In order to reach more text-oriented authors, an integration with Microsoft Word 
is planned. 

Abstract. Much research has focused on the problem of knowledge 
accessibility, sharing and reuse. Specific languages (e.g. KIF, CG, RDF) 
and ontologies have been proposed. Common characteristics, conventions 
or ontological distinctions are beginning to emerge. Since knowledge 
providers (humans and software agents) must follow common conventions 
for the knowledge to be widely accessed and re-used, we propose lexical, 
structural, semantic and ontological conventions based on various 
knowledge representation projects and our own research. These are minimal 
conventions that can be followed by most and cover the most common 
knowledge representation cases. However, agreement and refinements are 
still required. We also show that a notation can be both readable and 
expressive by briefly presenting two new notations - Formalized English 
(FE) and Frame-CG (FCG) - derived from the CG linear form [9] 
and Frame-Logics [4]. These notations support the above conventions, 
and are implemented in our Web-based knowledge representation and 
document indexation tool, WebKB^1 [7]. 



1 Introduction 

In [7], we argue that to permit precise, flexible and scalable retrieval and 
exploitation of knowledge representations (e.g. conceptual ontologies) and data 
indexed by them, the metadata/knowledge representation languages used should 
possess an expressive, intuitive and concise linear form; permit the indexation of 
any document and part of a document; support the use of undeclared terms; and 
permit the specification of paths in a semantic network. We argued against the 
direct use of XML-based languages such as RDF^2, and showed how WebKB and 
its languages satisfy these requirements. 

Precise, flexible and scalable knowledge retrieval also requires the agents (e.g. 
Web users or robots) who generate knowledge representations to follow conven- 
tions to permit subsequent comparison of the representations, and therefore their 
retrieval and fusion. In this paper, we propose lexical, structural, semantic and 
ontological conventions based on various knowledge representation projects (es- 
pecially Conceptual Graphs [9] and RDF [1]) as well as our own research [5] [6]. 

^1 http://meganesia.int.gu.edu.au/~phmartin/WebKB/ 
^2 http://www.w3.org/RDF/ 

B. Ganter and G.W. Mineau (Eds.): ICCS 2000, LNAI 1867, pp. 41-54, 2000. 
© Springer-Verlag Berlin Heidelberg 2000 

We also show that a notation can be both readable and expressive by 
introducing two notations - Formalized English (FE) and Frame-CG (FCG) - derived 
from CGLF and Frame-Logics. We do not expect these to become standards but 
they are new alternatives to CGLF and CGIF. They support and encourage the 
above conventions and associated facilities, and are implemented in our 
Web-based knowledge representation and document indexation tool, WebKB [7]. 

2 General Conventions 

Our conventions are general but, since we use the Conceptual Graph (CG) 
terminology to refer to components of knowledge representations, we assume these 
representations can be translated into such a directed graph model. 

2.1 Lexical Normalisation 

InterCap Style for Identifiers. XML^3 has become the de facto standard for 
data exchange, and RDF^4 and its XML notation (“RDF/XML”) will probably 
become the standard for metadata exchange. Therefore, it seems important that 
identifiers within knowledge representations be legal XML names. This is not 
particularly restrictive (URLs are permitted). More importantly, the “InterCap 
style” has been adopted in RDF for expressing terms, with a lower case first letter 
for relation types^5 - as in rhetoricalRelation and subClassOf - and an upper case 
first letter for concept types^6 - as in TaxiDriver. These naming conventions have 
also been adopted by the “Meta Content Framework Using XML”^7. 

High-level lexical facilities. To be used widely and reduce lexical problems, 
high-level languages or query interfaces should provide lexical facilities for the 
user. For instance, as is the case in WebKB, language analyzers should 
automatically normalize identifiers that include uppercase letters, dashes or 
underscores into the InterCap style, as well as exploit user-defined aliases. 
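Such a normalization step can be sketched in a few lines. This is an illustration only and not WebKB's actual analyzer: the function name `to_intercap`, its `relation` flag and the exact splitting rules are assumptions.

```python
import re


def to_intercap(identifier: str, relation: bool = False) -> str:
    """Normalize an identifier to InterCap style.

    Concept types start with an upper-case letter, relation types with a
    lower-case one; dashes, underscores and spaces are treated as word breaks.
    """
    words = [w for w in re.split(r"[-_\s]+", identifier) if w]
    # Fully upper-case words are lowered first so that TAXI_DRIVER gives TaxiDriver.
    words = [w.lower() if w.isupper() else w for w in words]
    name = "".join(w[0].upper() + w[1:] for w in words)
    if relation and name:
        name = name[0].lower() + name[1:]
    return name


print(to_intercap("taxi_driver"))                   # concept type
print(to_intercap("sub-class-of", relation=True))   # relation type
```

Identifiers that are already in InterCap style, such as TaxiDriver, pass through unchanged, so the function can be applied to every term without first checking its form.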

Such analyzers also accept queries or representations that use undeclared 
type names (e.g. common words) when the relevant type names can be auto- 
matically guessed via the structural and semantic constraints in the queries or 
representations and the ontologies they are based upon. When different inter- 
pretations are possible, the user should be alerted to make a choice. This last 
facility, detailed in [7], is particularly interesting when the exploited ontologies 
reuse a natural language lexical database such as WordNet^8: it spares the user 
the complex and tedious work of declaring and organizing each term used. This 
facility (along with high-level notations and interfaces) seems an essential step to 
encourage Web (human) users to build knowledge representations. Similar ideas 
for the exploitation of lexical databases such as WordNet are developed in [3]. 

^3 http://www.w3.org/XML/ 
^4 http://www.w3.org/RDF/ 
^5 http://www.w3.org/TR/REC-rdf-syntax/#usage 
^6 http://www.w3.org/TR/1998/WD-rdf-schema/#intro 
^7 http://www.w3.org/TR/NOTE-MCF-XML/#secA. 
^8 http://www.cogsci.princeton.edu/~wn/ 

Nouns for identifiers. Generally a sentence can be re-written to avoid the 
use of adjectives and verbs (with the exception of “to be” and “to have”). For 
instance, “A cat named Tom jumps toward a wooden table” may be re-written 
into “The cat which has for name Tom is agent of a jump that has for destination 
a table the material of which is some wood”. This sentence - which is a correct 
sentence in Formalized English (FE) - seems unnatural but makes the concepts 
and their relations explicit and therefore exploitable by an automated analyzer. 

The convention of using nouns, compound nouns or verb nominal forms when- 
ever possible within representations not only makes them more explicit, it also 
efficiently reduces the lexical and structural ways they may be expressed. It 
therefore increases the possibilities of matching them. 

Concept types denoted by adjectives^9 can rarely be organized by 
generalization relations but may be decomposed into concept types denoted by nouns. 
Concept types denoted by verbs can be organized by generalization relations (though 
the organization of the top-level types is difficult) but cannot be inserted into the 
hierarchy of concept types denoted by nouns (and therefore cannot be compared 
with them) unless verb nominal forms are used. These nominal forms, e.g. 
Driving, also recall the need to represent the time-frame or frequency of the referred 
processes. For similar reasons, value restrictors should also be represented via 
noun phrases, e.g. ImportantWeightForAMouse and ImportantWeightForAnElephant, 
rather than via adjectives such as Important. 

Most identifiers in current ontologies are nouns (e.g. the Dublin Core^10 or 
the Upper Cyc Ontology^11), even in relation type ontologies such as the 
Generalized Upper Model^12 relation hierarchy. Avoiding adverbs for relation type 
names is sometimes difficult, e.g. for spatial/temporal relations. However, this 
does not create problems in organizing relation types by generalization relations. 
What should be avoided is the introduction of relation type names such as 
isDefinedBy and seeAlso. Better names are definition and additionalInformation. 
These names are consistent with the usual reading conventions (e.g. in CG [9] 
and RDF^13) of graph triplets {concept source, relation, concept destination}: 
“<concept source> HAS FOR <relation> <concept destination>” or 
“<concept source> IS <relation> <concept destination>” or 
“<concept destination> IS THE <relation> OF <concept source>”. 
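The effect of these reading conventions on the choice of relation names can be checked mechanically. The sketch below is illustrative; the function name `readings` is hypothetical, and the third template is assumed to end with the source concept.

```python
def readings(source: str, relation: str, destination: str) -> list[str]:
    """Render a triplet {source, relation, destination} with the three reading templates."""
    return [
        f"{source} HAS FOR {relation} {destination}",
        f"{source} IS {relation} {destination}",
        f"{destination} IS THE {relation} OF {source}",
    ]


# A noun as relation name reads naturally; a verbal form does not.
for name in ("definition", "isDefinedBy"):
    print(readings("a term", name, "a sentence")[0])
```

With the noun the first template yields "a term HAS FOR definition a sentence", whereas the verbal name yields "a term HAS FOR isDefinedBy a sentence", which is why noun-based relation names are preferred.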



Singular nouns for identifiers. Most identifiers in ontologies are singular 
nouns. Category names must be in the singular in the Meta Content Framework 
Using XML. It is therefore better to avoid the introduction of plural identifiers 
whenever possible, e.g. by using keywords within representations such as the CG 
keyword Dist that specifies that a referent is distributive. 

^9 We refer to types representing the meanings of some adjectives, not misnamed types 
such as Abstract when “Abstract Entity” is actually the intended meaning. 
^10 http://purl.oclc.org/dc/ 
^11 http://www.cyc.com/cyc-2-1/cover.html 
^12 http://www.darmstadt.gmd.de/publish/komet/gen-um/node11.html 
^13 http://www.w3.org/TR/REC-rdf-syntax/#statement 

2.2 Structural and Semantic Normalisation 

We have seen that lexical conventions and facilities influence the structural and 
semantic aspects of representations. We now focus on these aspects. 

First-class relations. In CG and RDF, a relation is not local to an object, it 
is a first-class object itself and can be connected to any instance of the types 
given in the signature of the relation. This permits distributive developments 
(since anyone may represent anything about any object) and eases the process 
of comparing representations (since all terms are inter-related). This still permits 
the representation of relations necessarily or typically associated to objects of 
a particular type. Thus, although most frame-based systems also allow local 
relations, it is better to avoid them for the sake of knowledge reuse. 



Binary basic relations. Most frame-based models (including RDF) only have 
binary and unary relations. It is therefore better for knowledge reuse to use only 
unary or binary relations in languages such as CGs that allow n-ary relations. 
Relationships of arity greater than 2 may always be represented using structured 
objects or collections, or more primitive binary relations. For instance, “the point 
A is between the points B and C” may be represented using the binary relation 
type between and a collection object grouping B and C, or using the relation types 
left and right, above and under, etc. Most often, such a decomposition makes a 
representation more explicit, precise and comparable with other representations. 
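The first decomposition, via a collection object, may be sketched as follows. The triple layout, the `member` relation and the way the collection is named are assumptions made for this illustration only.

```python
def between_as_binary(middle: str, first: str, second: str):
    """Rewrite the ternary between(middle, first, second) as binary triples.

    A collection object groups the two outer points; the binary relation
    `between` then links the middle point to that collection.
    """
    collection = f"set({first},{second})"    # hypothetical collection identifier
    return [
        (middle, "between", collection),
        (collection, "member", first),
        (collection, "member", second),
    ]


for triple in between_as_binary("A", "B", "C"):
    print(triple)
```

Every resulting statement has exactly one source and one destination, so it can be stored in any model that is restricted to binary relations.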

Thus, relations should rather refer to simple/primitive relationships. As a rule 
of thumb, relations should not refer to processes and should rather be named with 
simple “relational nouns”, e.g. part and characteristic. Some complex relational 
nouns such as child and driver are often too handy to be avoided but imply 
additional lexical or structural facilities (e.g. those of Ontoseek [3]). 



Avoid disjunctions, negations and collections. Representations that 
include disjunctions, negations or collections are generally less efficiently 
exploitable for logical inference than conjunctive existential formulas and IF-THEN 
rules based on these formulas. 

It is often possible to avoid disjunctions and negations without loss of 
expressivity using IF-THEN rules or by exploiting type hierarchies. For instance, 
instead of writing that an object X is an instance of DirectFlight OR of 
IndirectFlight, it is better to declare X as an instance of a type Flight that has 
DirectFlight and IndirectFlight as exclusive subtypes (i.e. types that cannot have 
common subtypes or instances). Exclusion links between types or in some cases 
between whole formulas are kinds of negations that can be handled efficiently, 
and are included in many expressive but efficient logic models, e.g. Courteous 
logic on which the Business Rules Markup Language (BRML)^14 is based. 
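The following sketch shows why exclusion links between types are cheap to handle: checking whether two types may share an instance reduces to a walk up the hierarchy. The class and method names are hypothetical, and a single supertype per type is assumed to keep the example short.

```python
class TypeHierarchy:
    def __init__(self):
        self.parent = {}          # subtype -> supertype (single inheritance assumed)
        self.exclusive = set()    # unordered pairs of mutually exclusive types

    def add_subtypes(self, supertype, subtypes, exclusive=False):
        for subtype in subtypes:
            self.parent[subtype] = supertype
        if exclusive:
            for a in subtypes:
                for b in subtypes:
                    if a != b:
                        self.exclusive.add(frozenset((a, b)))

    def ancestors(self, type_name):
        """Return the type itself followed by all of its supertypes."""
        result = [type_name]
        while type_name in self.parent:
            type_name = self.parent[type_name]
            result.append(type_name)
        return result

    def compatible(self, a, b):
        """False when a supertype of a is declared exclusive with a supertype of b."""
        return not any(frozenset((x, y)) in self.exclusive
                       for x in self.ancestors(a) for y in self.ancestors(b))


h = TypeHierarchy()
h.add_subtypes("Flight", ["DirectFlight", "IndirectFlight"], exclusive=True)
h.add_subtypes("DirectFlight", ["CharterDirectFlight"])
print(h.compatible("CharterDirectFlight", "IndirectFlight"))
```

An object declared as a Flight stays compatible with both subtypes until more is known, which is the information the disjunction was meant to convey.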

The introduction of identifiers for collections may also be avoided using 
keywords such as Dist to specify a distributive interpretation or Col for a collective 
interpretation. Type definitions are also a way of representing facts about 
collections of objects that knowledge representation systems generally handle more 
efficiently than if collections are directly used. 

^14 http://www.oasis-open.org/cover/brml.html 

Contexts are often unavoidable for expressivity’s sake but they can be handled 
efficiently if treated as positive contexts [2], that is, when only their structures 
are taken into account, and not the special semantics of the terms they include. 

For efficiency reasons again, many knowledge representation systems, espe- 
cially frame-based systems, work on rooted graphs. In a rooted graph, there is a 
head node representing the “central” object of the representation, i.e. the object 
the other nodes detail. Therefore, to help the translations of representations for 
various kinds of systems, it seems better to begin each representation with its 
central object, whatever the language used. 



Precision, term definitions and constraints. The more precise the 
representations are, the less chance there is that they conflict with each other and 
the more they can be cross-checked, merged and exploited to answer queries 
adequately. Hence, constraints should be associated with types, and representations 
should be contextualized in space, time and author origin. No relevant concept 
should be implicit. For instance, instead of representing that “birds fly”, it seems 
better to represent that “a study made by Dr Foo (Foo@bird.org) found that in 1999, 
93% of healthy birds can fly” and categorize the species of birds under the exclusive 
subtypes BirdWhichCanNormallyFly and BirdWhichCannotNormallyFly^15. 

Most notations make it very difficult to represent such precise statements. 
Even Sowa’s CG linear format (CGLF) [9] needs to be extended to refer to the 
distributive interpretation of 93% of all instances of a type X; we use the form 
“[X: ∀ @93%]” in the CGLF representation of the previous example: 

[Description: [Situation: [PhysicalPossibility: 
      [[Bird: λ]->(chrc)->[Health: @good] : ∀ @93%]<-(Agent)<-[Flight] 
    ]]->(time)->[Date: "1999"] 
]->(source)->[Study]->(author)->[Person: Foo@bird.org] 

In Section 3.1 below, we give the translations of this example in FCG and FE 
to show that more intuitive notations are possible and useful. 

Before doing so, let us note that representations that include precise domain- 
oriented terms should still be retrievable via queries which include more general 
natural language terms. A way to do this is to specialize the terms of a natural 
language ontology such as WordNet with the domain-oriented terms. Extend- 
ing such an ontology is often quicker (and safer) than creating an ontology 
from scratch, ensures a better reusability of the representations and automatic 
comparisons with representations based on the same ontology. These issues are 
discussed and implemented in OntoLoom/PowerLoom^16. 

^15 It would have been very difficult to represent “birds of most species can fly” and the 
result would have been hardly exploitable for logical inference. 
^16 http://www.isi.edu/isd/OntoLoom/hpkb/OntoLoom.html#RTFToC18 

3 Notations 

To permit Web users to precisely index Web documents, or more generally, 
represent knowledge, the notation(s) they use need to be both expressive and 
intuitive. Otherwise, they will not use them or they will be forced or encouraged 
to represent information inadequately, which greatly reduces the value of the 
representations. A usual concern is that an increase in expressiveness leads to a 
model and a language too complex to handle efficiently. Actually, with structured 
models such as frames, RDF or CG, an inference engine may easily take more or 
fewer features into account to provide the degree of precision/efficiency required 
by a function or an application. For instance, a search engine can do a good and 
efficient job exploiting simple structure matching techniques and considering 
all contexts (e.g. modalities and negation) as positive contexts, so long as the 
retrieved representations are displayed with their associated contexts. 

In this section, we list knowledge representation cases that are common in 
natural language sentences but rarely taken into account by current 
general-purpose knowledge representation languages. We do not assume any kind of 
exploitation or formalization. We simply show how these cases can be represented 
in CGLF, FCG and FE, and why they sometimes remain ambiguous in CGLF. 

We created FCG and FE in order to have notations both more readable and 
expressive than CGLF (and therefore CGIF). We believed it was a necessary 
step for our Web-based information annotation/retrieval tool to be usable and 
useful. FE (Formalized English) has the same readability and precision purposes 
as other controlled languages such as Attempto Controlled English (ACE)^17 but 
is not domain-dependent and is more expressive (none of the following examples 
seems to be representable with ACE). FCG and FE are CG notations derived 
from CGLF: the structure has been kept but some syntactic sugar is different. 
Because arrows have been removed (mainly replaced by colons in FCG and 
keywords such as “that has for” in FE) and since common English articles or other 
“modifier” words can be used as quantifiers, these notations seem simpler than 
CGLF (and sometimes more expressive where set quantification and modalities 
are concerned). A “frame-like” notation similar to FCG and also using “a”, “the” 
and “every” as quantifier keywords is proposed by the Knowledge Machine [8] for 
the same readability purpose (however, it does not have other quantifiers). We 
cannot detail these notations here but their EBNF grammars and the Yacc+Lex 
grammars for their automatic translation to CGLF are Web-accessible^18. 

All the terms in our examples are WordNet nouns apart from some keywords, 
URLs, relation types and concept type instances used as measure values. 

3.1 Presentation of FCG and FE 

Here is the previous example represented in English, CGLF, FCG and FE. 

E:    A study made by Dr Foo (Foo@bird.org) found that in 1999, 
      93% of healthy birds can fly. 

CGLF: [Description: [Situation: [PhysicalPossibility: 
            [[Bird: λ]->(chrc)->[Health: @good] : ∀ @93%]<-(Agent)<-[Flight] 
          ]]->(time)->[Date: "1999"] 
      ]->(source)->[Study]->(author)->[Person: Foo@bird.org] 

FCG:  [[[93% of (bird, chrc: a good health), agent of#: a flight], time: 1999], 
       source: (a study, author: Foo@bird.org)] 

FE:   ‘‘‘93% of [bird with chrc a good health] can be agent of a flight’ 
      time 1999’ with source a study that has for author Foo@bird.org’. 

^17 http://www.ifi.unizh.ch/groups/req/staff/fuchs/. See also the Boeing Simplified 
English (http://www.boeing.com/assocproducts/sechecker/) for examples of restricted 
though still informal English. 
^18 http://meganesia.int.gu.edu.au/~phmartin/WebKB/doc/grammars/ 

Contexts are delimited by square brackets in FCG and quotes in FE. At 
the same context level, structuring is done via parentheses in FCG and the 
use of a comma or the keyword “and” in FE. Lambda-expressions are delimited by 
parentheses in FCG, square brackets in FE. Apart from these distinctions, both 
notations share the same features: the quantifier keywords (e.g. “a” and “the” 
as existential quantifiers, “several” and “at least” as collection quantifiers), the 
lexical facilities (e.g. the automatic normalization of terms) and other facilities 
(e.g. the automatic typing of contexts - and the creation of intermediary contexts 
if necessary - according to the signatures of the relations connected to them). 

3.2 Existential Statements, Contexts and Sentence Delimiters 

Simple information or sentences from documents can be represented using most 
general-purpose knowledge representation languages (including CG, KIF and 
RDF) since they permit the representation of existential statements and their 
contextualization (e.g. to represent modalities). We therefore do not detail these 
features. 

However, most of these languages, even metadata languages such as XML 
or RDF, still lack delimiters to include arbitrary sentences of any language (e.g. 
English or XML) inside the representations. FCG and FE have several pairs of 
delimiters, including “$(” and “)$”. The next FCG illustrates a use for them with 
the representation of the title of a certain document. It also shows how the details 
of a representation are made explicit by representing the representation process 
itself with a concept and then connecting binary relations to it. 

[a representation, agent: philippe.martin@gu.edu.au, language: FE, 
 ontology: http://www.int.gu.edu.au/~phmartin/WebKB/kb/KADSlontol.html, 
 creationDate: "21/01/1999", expirationDate: "22/7/9999", 
 source: (the title, part of: http://foo.bar.org/KADSl.html), 
 object: $( KADS-I models in CG )$,   /* <- text of the title */ 
 result: $( KADS-I has for part several models that are object of 
            a representation which has for language CG. )$   /* FE repr. */ ] 

3.3 Collections and Intervals 

Sowa [9] uses the symbols Dist, Col and Cum to make explicit the 
distributive, collective or cumulative interpretation of a collection referent. Dist or Col 
should not appear in more than one concept of a CG since this would lead to 
ambiguities. Therefore, coreferences between collection concepts of various CGs 
are needed. Unfortunately, this has not been dealt with by Sowa. The following 
example illustrates the problem with CGLF. 

E:    Together, Fred, Tom and another man approved a resolution. 
      A certain resolution was approved by each of them. 

CGLF: [Person: *s Col{Fred, Tom, *}@3]<-(approver)<-[Resolution] 
      [Set: *s] [*s Dist{*}]<-(approver)<-[Resolution: @certain] 

FCG:  [*g the group of persons *p {Fred, Tom, a man}, 
       approver of: a resolution] [a resolution, approver: *p] 

FE:   *g the group of persons *p {Fred, Tom, a man} 
      is approver of a resolution. A resolution has for approver *p. 

This example also shows how we used [Set: *s] in the CGLF to specify that 
each member of the group is different. In FCG and FE, a collection is by 
default a set. The distributive interpretation is also the default. The collective 
interpretation is specified with the keyword group and referred to via a variable 
- here *g. In the CGLF, we used Sowa’s keyword @certain to specify that each 
member approved the same resolution. In FCG and FE, this is made explicit via 
the order of the concepts (as it is in English; we have adopted this solution 
because it is intuitive and it permits us to combine other kinds of quantifiers as the 
following examples show). The ’s’ at the end of the terms used for representing 
collections is automatically removed by the FCG parser. 

Many keywords or special types need to be specified to permit the
representation of collections, that is, (i) their kinds: Set, Bag, OR-Bag, XOR-set,
etc., and (ii) their restrictors: most, mostly, at most, dozens, etc. Below is an
example involving two collections (“*r” is supposed to be pre-declared). The CGLF
statement is ambiguous (the scopes of the quantifiers are not explicit).

E:    At least 3 persons, including Fred,
      have each approved most of the resolutions *r.
CGLF: [Person: Dist{Fred, *}@>=3] <-(Approver)<- [*r {*}@most]
FCG:  [most of *r, approver: at least 3 persons {Fred,*}]
FE:   Most of *r has for approver at least 3 persons {Fred,*}.
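
The features of a collection discussed in this section (its kind, its
interpretation and its restrictor) can be pictured as a small data structure.
The following Python sketch is only an illustration under stated assumptions:
the class Collection, its field names and the method to_fcg are hypothetical
and are not part of FCG, FE or any parser mentioned in this article.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Collection:
    """Hypothetical in-memory form of a collection referent (not an FCG API)."""
    type_name: str                         # singular term, e.g. "person"
    members: Tuple[str, ...] = ()          # explicitly named members
    open_ended: bool = False               # True when "*" stands for unnamed members
    kind: str = "set"                      # a collection is a set by default
    interpretation: str = "distributive"   # default; "collective" uses the keyword group
    restrictor: Optional[str] = None       # e.g. "at least 3"

    def to_fcg(self) -> str:
        """Render a simplified FCG-like term; variables such as *g are omitted."""
        head = f"{self.type_name}s"
        if self.restrictor:
            head = f"{self.restrictor} {head}"
        if self.interpretation == "collective":
            head = f"the group of {head}"
        items = list(self.members) + (["*"] if self.open_ended else [])
        if not items:
            return head
        return head + " {" + ", ".join(items) + "}"


approvers = Collection("person", members=("Fred",), open_ended=True,
                       restrictor="at least 3")
print(approvers.to_fcg())  # at least 3 persons {Fred, *}
```

Keeping the defaults (set, distributive) in the structure mirrors the defaults
of FCG and FE, so only the collective reading has to be stated explicitly.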



3.4 Universal Quantifiers 

The keywords for collections are handy to reuse for quantifying over the instances
of a type. If no restrictor is used (as in [File: ∀] ->(Author)-> [Agent]),
relations necessarily connected to the instances of a type (i.e. necessary
conditions) are defined. The use of restrictors such as “most” or percentages is a
way to define “typical” relations. Number intervals and keywords such as “at least”
and “at most” may be used for representing relations of “entity-relationship”
models, as in the FCG [any company, employee: at least 1 person].

Modalities and physical possibilities may be represented via contexts. For
readability and normalisation reasons, we introduced special keywords in FE
(can and may) and FCG (# : and <=). Thus, the following FCG represents the
sentence “any description describes something and may be believed by a cognitive
agent”: [any description, descr of: a thing, believer <= a cognitive_agent]

For the same reasons, we allowed some operators (=>, <=>, <=, =, !=, <,
=<, >, >=) to be used as relation types in FCG and FE. Here is an example of
the use of the logical operator => and of the use of coreference inside relations
for writing a second-order statement that is easy to read (in FCG).

FCG: [ [a relationType *r, chrc: the transitivity], =>
       [ [a thing *x, *r: (a thing *y, *r: a thing *z)], => [*x, *r: *z]]]
CG:  ~[ [RelationType: *r] ->(chrc)-> [Transitivity]
        ~[ ~[ [Thing: *x]->(*r)->[Thing: *y]->(*r)->[Thing: *z]
              ~[ [*x]->(*r)->[*z] ] ] ] ]
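
Read as a rule, the statement above says that whenever a relation type *r has
the characteristic transitivity, "*x *r *y" and "*y *r *z" together imply
"*x *r *z". A minimal Python sketch of applying that rule to a set of binary
facts follows. The function name and the encoding of facts as triples are
assumptions made for illustration; they do not describe an existing inference
engine.

```python
def apply_transitivity(facts, transitive_types):
    """Return the facts closed under transitivity for the given relation types.

    facts: set of (relation_type, source, target) triples.
    transitive_types: relation types declared to have the transitivity
    characteristic; other relation types are left untouched.
    """
    closed = set(facts)
    changed = True
    while changed:                      # repeat until no new fact is derived
        changed = False
        for (r1, x, y) in list(closed):
            if r1 not in transitive_types:
                continue
            for (r2, y2, z) in list(closed):
                if r2 == r1 and y2 == y and (r1, x, z) not in closed:
                    closed.add((r1, x, z))
                    changed = True
    return closed


facts = {
    ("part", "car", "engine"),
    ("part", "engine", "piston"),
    ("near", "a", "b"),
    ("near", "b", "c"),
}
closed = apply_transitivity(facts, {"part"})
```

Only part is declared transitive here, so ("part", "car", "piston") is derived
while no new near fact is added.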



4 Terms and Conventions for Ontological Cases 

We now focus on how objects of certain categories can be inter-related. 

4.1 Some General Categories 

Relatively few top-level concept types are required for the signatures of most
kinds of relations useful for general knowledge representation, e.g. for the
representation of natural language sentences or images from documents. These
concept types and the constraints associated with them (e.g. the exclusion links
between them) are useful for organizing ontologies, guiding knowledge modelling
and preventing certain inconsistencies. We detailed this in [7] and introduced a
top-level ontology of 150 concept types and 150 basic relation types. We have 
used these concept types for organizing the upper levels of the WordNet on- 
tology. In the remainder of this article, we focus on the most generic of these 
concept types and represent the necessary or possible relationships between their 
instances. In this way, we also propose a model for knowledge representation. 

Before doing so, let us highlight with an example how these general types help 
render precise the knowledge representation process. We obtained a hierarchy of 
terms about computer and network technology. It appeared that the hierarchy 
was not a generalization hierarchy since it mixed various kinds of objects and 
therefore various kinds of relations. Here is an annotated extract. 

Computing                //process or domain?
Computer_hardware        //physical object!
Compiler                 //hardware or software?
Computer_software        //description!
Software_language        //description medium!
Applications             //process description (may or may not be software)!
Networking               //process or domain?

Here is, represented in FCG, a part of the ontology that we have built to
make explicit the general category of each of the above terms and thus permit 
semantic checks and the use of relations associated to the general categories. The 
author of the previous hierarchy had to be contacted to determine what some 
of the terms meant. Indentation is only for presentation purposes. 

The relation partition connects a type to a set of partitions, each being a set of 
exclusive types. Thus, two pairs of brackets are required even for a single partition. 

[Domain_object, subtype: {Computing_object, Networking_object}]
[Networking_object, partition: {{Network_software, Network_hardware}}]
[Physical_entity, subtype: Hardware]
[Hardware, subtype: {Computer_hardware, Network_hardware}]
[Network_hardware, partition: {{Local_area_network, Switch}}]
[Description, subtype: {Software, Application}]
[Descr_medium, subtype: {Software_language, Network_language}]

Here are the generalization and exclusion relations between our most generic 
concept types. Indentation is only for presentation. 

[Thing ("the supertype of all first order concept types"),
   partition: { {Situation, Entity}, {Thing_playing_a_role} }]
[Situation ("a thing that occurs in a region of time and space"),
   partition: { {State, Process}, {Phenomenon},
                {Situation_playing_a_role} }]
[Entity ("a thing that may be involved in a situation"),
   partition: { {Information_entity, Temporal_entity, Spatial_entity},
                {Collection}, {Entity_playing_a_role} }]
[Spatial_entity ("an entity that occupies a space region"),
   partition: { {Space_location, Physical_entity,
                 Imaginary_spatial_entity} }]
[Physical_entity ("a spatial entity made of matter"),
   partition: { {Inanimate_object, Living_thing},
                {Goal_directed_entity ("cognitive entity")} }]
[Information_entity ("for information or its representations"),
   partition: { {Description, Descr_container,
                 Characteristic, Measure, Measure_unit} }]
[Description ("description of a situation"),
   partition: { {Description_content, Description_medium} }]
[Description_content, partition: { {Belief, Hypothesis, Narration} }]
[Description_medium, partition: { {Symbol, Syntax, Language, Script} }]
[Description_container, partition: { {File, Image, Document_element} }]
[Characteristic ("a dimension of something"),
   partition: { {Psychological_characteristic,
                 Physical_characteristic, Situation_characteristic} }]
[Measure, partition: { {Psychological_measure,
                        Physical_measure, Situation_measure} }]
[Measure_unit, partition: { {Psychological_measure_unit,
                             Physical_measure_unit, Situation_measure_unit} }]
[Thing_playing_a_role ("subcategorisation is domain-dependent"),
   partition: { {Domain_object}, {Thing_needed_for_a_process},
                {Entity_playing_a_role, Situation_playing_a_role} }]
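
Because each partition is a set of mutually exclusive types, an analyzer can
reject an object that is declared to belong to two types of the same partition.
The Python sketch below illustrates one such check on a fragment of the
listing above. The dictionary layout and the function exclusion_violations are
assumptions made for this example, and subtype inheritance is deliberately
ignored to keep it short.

```python
from itertools import combinations

# Relation "partition": type -> list of partitions, each a set of exclusive types.
# Only a fragment of the ontology is copied here.
PARTITIONS = {
    "Thing": [{"Situation", "Entity"}, {"Thing_playing_a_role"}],
    "Situation": [{"State", "Process"}, {"Phenomenon"},
                  {"Situation_playing_a_role"}],
    "Description": [{"Description_content", "Description_medium"}],
}


def exclusion_violations(declared_types):
    """Return (supertype, type_a, type_b) for each pair of declared types
    that appear together in one partition, i.e. that exclude each other."""
    violations = []
    for supertype, partitions in PARTITIONS.items():
        for partition in partitions:
            clashing = sorted(partition & set(declared_types))
            violations.extend(
                (supertype, a, b) for a, b in combinations(clashing, 2)
            )
    return violations


print(exclusion_violations({"State", "Process"}))
print(exclusion_violations({"Situation", "Thing_playing_a_role"}))
```

The first call reports a clash because State and Process belong to the same
partition of Situation. The second reports none, since Situation and
Thing_playing_a_role sit in different partitions of Thing and are therefore
not declared exclusive.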

4.2 Descriptions

We now focus on necessary and possible relationships from concepts of the types 
listed above. We begin with “descriptions” and related objects. 

[any thing,
   descr <= a description,                //anything may be described
   descr_in <= a descr_container,         //e.g. in a file, a hologram
]

[any description,                         //or "proposition"
   descr of: a thing,
   descr_medium: a descr_medium,          //symbols or a language
   descr_container: a descr_container,    //e.g. in a file
   author: 1 entity,                      //unique author
   believer <= a cognitive_agent,  modality <= a modality,
   logical_relation <= a description,  rhetorical_relation <= a description
]

[any descr_container, descr_support: a physical_entity,
                      descr_in of: a thing]

[any descr_medium, descr_medium of: a thing]

[ [a thing *t, descr_in: a descr_container *c],
  <=> [*t, descr: (a description, descr_container: *c)] ]

Examples of logical relation types are Or and Xor. Examples of rhetorical 
relation types are summary, motivation and antithesis. 



4.3 Characteristics and Measures 

When representing characteristics, the characteristic itself should be distinguish- 
ed from its measure(s), as in [a pen, physChrc: (a length, measure: 
12 cm)] . Since this habit does not come naturally, abbreviations such as the fol- 
lowing should probably be adopted as conventions: [a pen, length: 12 cm]. 
However, this implies additional constraints on the ontology and on its exploita- 
tion by the analyzers, e.g. Length should be declared as a subtype of a type 
Characteristic and this would have to be a predefined type in any analyzer. Here 
are relations for explicit representations. 

[any thing, chrc <= a characteristic]      //e.g. speed, ingenuity
[any characteristic, measure <= a measure, chrc of <= a thing]
[any measure, quantity: a number, unit: a measure_unit,
              measure of: a characteristic]
[any physical_entity, physChrc <= a physical_characteristic]
[any goal_directed_entity, physChrc <= a psychological_characteristic]
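
The abbreviation convention discussed at the start of this section requires
the analyzer to know which types are characteristics. As a rough sketch, an
abbreviated relation could be expanded into its explicit form as shown below
in Python. The set CHARACTERISTIC_TYPES and the function expand are
hypothetical names introduced for this example; a real analyzer would consult
the ontology instead of a hard-coded set.

```python
# Assumed to be declared as subtypes of Characteristic in the ontology.
CHARACTERISTIC_TYPES = {"length", "weight", "speed"}


def expand(concept, relation, value, chrc_relation="chrc"):
    """Expand [concept, length: 12 cm] into
    [concept, chrc: (a length, measure: 12 cm)].

    A relation that does not name a known characteristic type is
    returned in its original form.
    """
    if relation in CHARACTERISTIC_TYPES:
        return f"[{concept}, {chrc_relation}: (a {relation}, measure: {value})]"
    return f"[{concept}, {relation}: {value}]"


print(expand("a pen", "length", "12 cm", chrc_relation="physChrc"))
print(expand("a pen", "owner", "John"))
```

The first call produces the explicit form used in the text, [a pen, physChrc:
(a length, measure: 12 cm)], while the second leaves an ordinary relation
unchanged.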

A lot of concept types may be found in WordNet for physical or psychological 
features, e.g. Memory and Cognition, but unfortunately those types are not well 
organized and often mixed with misclassified types such as Mind, Lexicon and 
Structure. 

4.4 Situations, Processes and Temporal Entities

Here are some representations of relationships that are, or may be, connected
to situations, processes and temporal entities. Most of the relation types were
proposed by [9].

[any situation,
   s_succ of: a situation,  s_succ: a situation,
   location: a spatial_entity,
   time: a temporal_entity,  duration: a duration,
   situationChrc <= a situation_characteristic
]

[any temporal_entity,
   time of <= a situation,  duration of <= a situation,
   temporal_order_relation <= a temporal_entity
]

[any process,
   triggering_event <= an event,      ending_event <= an event,
   ending <= a state,                 ending of <= a state,
   precondition <= a state,           postcondition <= a state,
   initiator <= a goal_directed_agent,
   agent <= an entity,
   instrument <= an entity,
   object <= a thing,
   experiencer <= a conscious_agent,
   recipient <= an agent,
   result <= a thing,
   sub_process <= a process,
   manner <= a situation_characteristic,
   method <= a description,           source <= a spatial_entity,
   destination <= a spatial_entity,   path <= a spatial_entity
]



The relation types input and output may respectively be declared as subtypes
of object and result. Further specializations are input_output, object_to_modify
and object_to_mute.

Given we may use intervals and exclusive sets for representing time concepts, 
only two relation types seem necessary between situations and temporal entities: 
time and duration. Here are examples of two typical cases. 

E: On the 21/12/1999, John went to his office between 13h and 14h. 

FCG: [[a travel, agent: John, destination: (an office, owner: John), 
time: 13 to 14 hour], time: "21/12/1999"] 



E: John usually takes 20 min or 40 min to go to his office.

FCG: [most of (travel, agent: John,
                destination: (an office, owner: John)),
      duration: {20 min | 40 min}]

These examples do not violate the signatures of time and duration. The 
following representation, which uses a context, should be considered by the ana- 
lyzers as equivalent to the first of the above two examples. 

[ [John, agent of: a travel, destination: (an office, owner: John)], 
time: 13 to 14 hour] 

4.5 Miscellaneous

Here are additional common kinds of relationships. 

[any thing, part <= a thing]    //anything may have at least 1 part
[any physical_entity, material: a physical_entity]
[any collection, subset <= a collection,
                 element <= a thing, count: a natural]
[Order_relation, domain: Thing, range: Thing,
   partition: { {Spatial_order_relation, Temporal_order_relation},
                {Meet, In, Near, Before, After} }]
[Spatial_order_relation, subtype: {On, Above, Below}]
[On, subtype of: {Meet, Above}]

We have defined relation types such as meet and near as direct subtypes of
order_relation and allowed them to connect any pair of concepts. Specializations
of these types, e.g. spatial_meet and temporal_meet, could be defined to allow the
use of more precise and constrained types upon which further semantic checks 
could be done. Such precise modelling may be found in the CYC and Ontolingua 
top-level ontologies (for instance, 2D and 3D spatial relations are distinguished) . 
However, we cannot expect the average user to spend his time looking for the 
most specific terms in such libraries. Nonetheless, these libraries could be ex- 
ploited by authoring tools to automatically find more specific relations that do 
not violate the signatures of the general relation used. 

Though relations such as part or subset are partial order relations like subtype, 
for the sake of precision, they should not be directly connected to concept types. 
For instance, [Airplane , part : Wing] might be intended to represent the fact 
that “any airplane has for part a wing”, but many alternative interpretations 
are possible: “any wing is part of a plane”, “a wing is part of any plane”, etc. 

5 Conclusion 

Information can be represented in many ways. For knowledge representations to 
be automatically comparable, conventions must be followed by authoring agents. 
We have proposed general lexical, structural and semantic conventions, then exa- 
mined some knowledge representation cases that are common in natural language 
sentences but rarely taken into account by current general-purpose knowledge 
representation languages. We have introduced two notations (Frame-CG and 
Formalized English) that support and guide the use of these conventions (e.g. 
the syntax, the quantifiers and restrictors - “a”, “the”, “several”, etc. - lead to 
the use of nouns as identifiers), cover the listed knowledge representation cases
and remain intuitive. We argued that an inference engine can exploit expressive 
languages efficiently (at the expense of precision) by ignoring some of the more 
complex features. Finally, some top-level concept types and their relationships 
were provided to guide and check knowledge representation. 

As highlighted above, the precise models that are found in the CYC and
Ontolingua top-level ontologies are certainly useful but are unlikely to be used
directly by human agents to represent information or quickly index documents,
sentences or images. Instead, we expect people (e.g. Web users) to utilize a small
set of relation types and use common words for concept types: given the signatures
of the relation types, a lexical database such as WordNet may be exploited by an
authoring tool to derive the relevant concept types or ask the user for more
precision [3] [7]. Users will also probably use scalable multi-user knowledge
servers to refer to, refine or complement lexical databases.

Abstract. The Internet is a giant semiotic system. It is a massive
collection of Peirce’s three kinds of signs: icons, which show the form of
something; indices, which point to something; and symbols, which
represent something according to some convention. But current proposals
for ontologies and metadata have overlooked some of the most
important features of signs. A sign has three aspects: it is (1) an entity that
represents (2) another entity to (3) an agent. By looking only at the
signs themselves, some metadata proposals have lost sight of the entities
they represent and the agents - human, animal, or robot - which
interpret them. With its three branches of syntax, semantics, and pragmatics,
semiotics provides guidelines for organizing and using signs to represent
something to someone for some purpose. Besides representation,
semiotics also supports methods for translating patterns of signs intended
for one purpose to other patterns intended for different but related
purposes. This article shows how the fundamental semiotic primitives are
represented in semantically equivalent notations for logic, including
controlled natural languages and various computer languages.



1 Problems and Issues 

Ontologies contain categories, lexicons contain word senses, terminologies con- 
tain terms, directories contain addresses, catalogs contain part numbers, and 
databases contain numbers, character strings, and BLOBs (Binary Large OB- 
jects). All these lists, hierarchies, and networks are tightly interconnected col- 
lections of signs. But the primary connections are not in the bits and bytes that 
encode the signs, but in the minds of the people who interpret them. The goal 
of various metadata proposals is to make those mental connections explicit by 
tagging the data with more signs. Those metalevel signs themselves have further 
interconnections, which can be tagged with metametalevel signs. But meaning- 
less data cannot acquire meaning by being tagged with meaningless metadata. 
The ultimate source of meaning is the physical world and the agents who use 
signs to represent entities in the world and their intentions concerning them. 

The study of signs, called semiotics, was independently developed by the 
logician and philosopher Charles Sanders Peirce and the linguist Ferdinand de 
Saussure. The term comes from the Greek sema (sign); Peirce originally called it 
semeiotic, and Saussure called it semiology, but semiotics is the most common 
term today. As Saussure (1916) defined it, semiology is a field that includes all 
of linguistics as a special case. But Peirce (CP 2.229) had an even broader view
that includes every aspect of language and logic within the three branches of
semiotics:

1. Syntax. "The first is called by Duns Scotus grammatica speculativa. We may
term it pure grammar." Syntax is the study that relates signs to one another.

2. Semantics. "The second is logic proper," which "is the formal science of 
the conditions of the truth of representations." Semantics is the study that 
relates signs to things in the world and patterns of signs to corresponding 
patterns that occur among the things the signs refer to. 

3. Pragmatics. "The third is... pure rhetoric. Its task is to ascertain the laws 
by which in every scientific intelligence one sign gives birth to another, and 
especially one thought brings forth another." Pragmatics is the study that 
relates signs to the agents who use them to refer to things in the world and 
to communicate their intentions about those things to other agents who may 
have similar or different intentions concerning the same or different things. 

According to Peirce, semiotics is the science that studies the use of signs 
by "any scientific intelligence." By that term, he meant "any intelligence capa- 
ble of learning by experience," including animal intelligence and even mindlike 
processes in inanimate matter. By Peirce’s criteria, computer techniques for pro- 
cessing knowledge bases and databases could be called computational semiotics. 

Unfortunately, most word processors deal only with a small subset of syn- 
tax. They have produced what St. Laurent (1999) calls the WYSIWYG disaster. 
"Plain text, dull though it may be, is much easier to manage than the output 
of the average word processor or desktop publishing program." In practice, the 
slogan "What you see is what you get" actually means WYSIAYG: "What you 
see is all you get." The text is so overburdened with formatting tags that there 
is no room for semantics or pragmatics. The so-called Rich Text Format (RTF) 
is semantically the most impoverished representation for text ever devised. For- 
matting is an aspect of signs that makes them look pretty, but it fails to address 
the more fundamental question of what they mean. 

To address meaning, the markup languages in the SGML family were de- 
signed with a clean separation between formatting and meaning. When properly 
used, SGML and its successor XML use tags in the text to represent semantics 
and put the formatting in more easily manageable style sheets. That separa- 
tion is important, but the semantic tags themselves must have a clearly defined 
semantics. Most XML manuals, however, provide no guidelines for represent- 
ing semantics. Following is an excerpt from one of the proposed standards for 
representing resources in XML: 

A resource can be anything that has identity. Familiar examples in- 
clude an electronic document, an image, a service (e.g., "today’s weather 
report for Los Angeles"), and a collection of other resources. Not all re- 
sources are network "retrievable"; e.g., human beings, corporations, and 
bound books in a library can also be considered resources. (Berners-Lee, 
et al. 1998) 

In that report, an electronic document is considered familiar, but human be- 
ings are unfamiliar "resources" mentioned only as an afterthought. Yet without 
the people, the document and its contents have no meaning. 

Many of the ontologies for web objects ignore physical objects, processes,
people, and their intentions. A typical example is SHOE (Simple HTML On- 
tology Extensions), which has only four basic categories: String, Number, Date, 
and Truth (Heflin et al. 1999). Those four categories, which are needed to de- 
scribe the syntax of web data, cannot by themselves describe the semantics. 
Strings contain characters that represent statements that describe the world; 
numbers count and measure things; dates are time units tied to the rotation of 
the earth; and truth is a metalanguage term about the correspondence between 
a statement and the world. Those categories can only be defined in terms of the 
world, the people in the world, and the languages people use to talk about the 
world. Without such definitions, the categories are meaningless tags that confer 
no meaning upon the data they are attached to. 

In discussing the Resource Description Framework (RDF), which is based 
on the XML facilities, Bray (1998) presented a broader view of the kinds of 
categories that web-based metadata should represent: 

It seems unlikely that one PropertyType standing by itself is apt to be 
very useful. It is expected that these will come in packages; for example, 
a set of basic bibliographic PropertyTypes like Author, Title, Date, and 
so on. Then a more elaborate set from OCLC, and a competing one 
from the Library of Congress. These packages are called Vocabularies; 
it’s easy to imagine PropertyType vocabularies describing books, videos, 
pizza joints, fine wines, mutual funds, and many other species of Web 
wildlife. 

This is a good statement of one issue, but it raises other issues: How are the 
packages related to one another? How is the Date property of the OCLC package 
related to the Vintage property of a wine package? Can packages inherit type 
definitions from other packages? If two packages are competing, is there any way 
to define conversion rules for translating or redefining the types of one in terms 
of another? A human reader may know that a wine vintage can be compared to 
an OCLC date, but without a formal definition, the computer cannot. 
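
One way to picture such a conversion rule is sketched below in Python. It is
an illustration built on assumptions, not a feature of RDF or of any actual
OCLC or wine vocabulary: the function names are hypothetical, and a vintage is
simply read as the calendar year of the harvest.

```python
from datetime import date


def vintage_to_date_range(vintage_year):
    """Hypothetical conversion rule: map a wine vintage (a harvest year)
    to the interval of calendar dates it could denote."""
    return date(vintage_year, 1, 1), date(vintage_year, 12, 31)


def vintage_precedes(vintage_year, other_date):
    """True if every date covered by the vintage lies before other_date."""
    _, last_day = vintage_to_date_range(vintage_year)
    return last_day < other_date


publication_date = date(1998, 6, 1)   # a value from a bibliographic Date property
print(vintage_precedes(1995, publication_date))   # True
print(vintage_precedes(1998, publication_date))   # False
```

Once the rule is written down, the comparison that a human reader makes
implicitly becomes something a program can carry out, and the assumption
behind it (a vintage is a whole year, not a single day) is open to inspection.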

Ironically, the computer networks that make it easier to transmit data have 
made it more difficult to share data. In continuing his discussion, Bray raised 
further issues: 

Nobody thinks that everyone will use the same vocabulary (nor should 
they), but with RDF we can have a marketplace in vocabularies. Any- 
one can invent them, advertise them, and sell them. The good (or best- 
marketed) ones will survive and prosper. Probably, most niches of infor- 
mation will come to be dominated by a small number of vocabularies, 
the way that library catalogues are today. 

There are already thousands, if not millions of competing vocabularies. The 
tables and fields of every database and the lists of items in every product catalog 
for every business in the world constitute incompatible vocabularies. When prod- 
uct catalogs were distributed on paper, any engineer or contractor could read 




58 



John F. Sowa 



the catalogs from different vendors and compare the specifications. But minor 
variations in the terminology of computerized catalogs can make it impossible 
for a computer system to compare components from different vendors. 

By standardizing the notations, XML and RDF take an important first step, 
but that step is insufficient for data sharing without some way of comparing, 
relating, and translating the vocabularies. Phipps (2000) warned that standard- 
izing the vocabularies may create even more difficulties "by hiding complexities 
behind superficial agreements": 

To connect from the heart of my e-business to the heart of yours 
would be impossibly expensive in shared systems without XML, but 
even with it the system analysis needed to create the translation is a 
significant task. We should not assume that XML is a panacea or that the 
standardization of vocabularies will automatically bring interoperability. 
XML provides us with a medium to express our understanding of the 
meaning of data, but we still have to discern realities and differences of 
meanings when we exchange data. 

More important than standardizing vocabularies is the development of meth- 
ods for defining and translating vocabularies. To have a sound semantics and 
pragmatics, those methods must relate the terms in the vocabularies to the 
things they refer to and to the people who use them to communicate informa- 
tion about those things. 

The purpose of this paper is to analyze the differences of meaning, to explore 
their implications for web-based metadata, and to show how the methods of 
logic and ontology can be used to define, relate, and translate signs from one 
vocabulary to another. Among the methods discussed in this paper are Peirce’s 
systems of logic, ontology, and semiotics, which are presented in more detail in 
the book Knowledge Representation¹ by Sowa (2000).

2 Signs of Signs 

Metalanguage consists of signs that signify something about other signs, but 
what they signify depends on what relationships those signs have to each other, to 
the entities they represent, and to the agents who use those signs to communicate 
with other agents. Figure 1 shows the basic relationships in a meaning triangle 
(Ogden and Richards 1923). On the lower left is an icon that resembles a cat 
named Yojo. On the right is a printed symbol that represents his name. The 
cloud on the top gives an impression of the neural excitation induced by light 
rays bouncing off Yojo and his surroundings. That excitation, called a concept, 
is the mediator that relates the symbol to its object. 

¹ See URL http://www.bestweb.net/~sowa/krbook/

[Figure: a triangle with the vertices Concept (top), Object (lower left), and
Symbol, the printed name "Yojo" (right)]

Fig. 1. The meaning triangle

The triangle in Figure 1 has a long history. Aristotle distinguished objects,
the words that refer to them, and the corresponding experiences in the psyche.
Frege and Peirce adopted that three-way distinction from Aristotle and used it
as the semantic foundation for their systems of logic. Frege’s terms for the three
vertices of the triangle were Zeichen (sign) for the symbol, Sinn (sense) for the
concept, and Bedeutung (reference) for the object. As an example, Frege cited
the terms morning star and evening star. Both terms refer to the same object, the
planet Venus, but their senses are very different: one means a star seen in the
morning, and the other means a star seen in the evening. Following is Peirce’s
definition of sign:

A sign, or representamen, is something which stands to somebody 
for something in some respect or capacity. It addresses somebody, that 
is, creates in the mind of that person an equivalent sign, or perhaps a 
more developed sign. That sign which it creates I call the interpretant 
of the first sign. The sign stands for something, its object. It stands for 
that object, not in all respects, but in reference to a sort of idea, which 
I have sometimes called the ground of the representamen. (CP 2.228) 

The terms morning star and evening star are distinct signs that create dif- 
ferent concepts or interpretants in the mind of the listener. Both concepts stand 
for the same object, but in respect to a different ground, which depends on the 
time of the observation. 

Aristotle observed that symbols could symbolize other symbols, as "written 
words are symbols of the spoken." Frege said that his logic could be used as a 
language to talk about the logic itself. But Peirce went further than either of 
them in recognizing that multiple triangles could be linked together in different 
ways by attaching a vertex of one to a vertex of another. By stacking another 
triangle on top, Figure 2 represents the concept of representing an object by a 
concept. The upper triangle relates the cloud that suggests the concept of Yojo to 
the symbol [Cat: Yojo], which is a printable symbol for the more elusive neural
excitation. At the very top is a cloud for the neural excitation that occurs when 
some person recognizes that Yojo is being represented by a printed symbol. 




Fig. 2. Concept of representing an object by a concept 



Meaning triangles can be linked side by side to represent signs of signs of 
signs. On the left of Figure 3 is the triangle of Figure 1, which relates Yojo to his 
name. The middle triangle relates the name Yojo to the quoted string "Yojo". 
The rightmost triangle relates that character string to its encoding as a bit string 
0x596F6A6F. In each of the three triangles, the symbol is related to its object 
by a different metalevel process: naming, quoting, or representing. At the top 
of each triangle, the clouds that represent the unobservable neural excitations 
have been replaced by concept nodes that serve as printable symbols of those 
excitations. The concept node [Cat: Yojo] is linked by the conceptual relation 
node (Name) to a node for the concept of the name [Word: "Yojo"], which is 
linked by the conceptual relation node (Repr) to a node for the concept of the 
character string itself [String: ’Yojo’]. The resulting combination of concept 
and relation nodes is an example of a conceptual graph (CG). 
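
The last link of the chain, from the character string to its bit string, can
be reproduced in a few lines of Python. The sketch assumes an ASCII encoding,
which the text does not name explicitly but which yields the bit string quoted
above.

```python
name = "Yojo"                        # the quoted character string
encoded = name.encode("ascii")       # its representation as a sequence of bytes
bit_string = "0x" + encoded.hex().upper()

print(bit_string)                    # 0x596F6A6F
print(encoded.decode("ascii"))       # Yojo
```

The variable names make the metalevels explicit: name holds the string,
encoded holds its encoding, and neither of them is the cat.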

To deal with meaning, semiotics must go beyond relationships between signs 
to the relationships of signs, the world, and the agents who observe and act upon 
the world. Symbols are highly evolved signs that are related to actual objects by 
previously established conventions. People agree to those conventions by relating 
the symbols to more primitive signs, such as icons, which signify their objects by 
some structural similarity, and indices, which signify their objects by pointing 
to them. All these signs can be related to one another by linking series or even
arrays of triangles. Additional triangles could show how a name is related to the
person who assigns the name, to the reason for giving an object one name rather
than another, or to an index that points to some location where the object may
be found.

Fig. 3. Object, name of object, symbol of name, and character string

Different kinds of applications require different levels of detail in the metadata.
For information retrieval (IR), a simple string search can often find a web
page with the desired information. To find information about Yojo the cat, one
could search for the strings "Yojo" and "cat"; to find information about
Queequeg’s ebony idol in the novel Moby Dick, one could search for the strings
"Yojo" and "Queequeg". IR systems depend on a human reader to decide which strings
to search for and to interpret the results that are retrieved. Systems that go be- 
yond simple search must be able to distinguish the physical object Yojo from an 
icon that resembles the object, the name of the object, and the character string 
that represents the name. Following is an interchange between a human user who 
asked a question and a computer system that did not make those distinctions: 

Q: What is the largest state in the US? 

A: Wyoming. 

To answer questions about sizes, the computer would use the greater-than 
operator to compare numbers. When it applied that operator to the character 
strings, it found the last state in alphabetical order, which does not happen 
to be the largest state in either area or population. A loosely defined system 
of metadata may be adequate for finding information, but inadequate for any 
further processing. As Phipps observed, superficial agreements about vocabulary 
may hide complexities that make interoperability impossible. 
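
The failure in that interchange can be reproduced in a few lines. The following is a minimal sketch, not the system described above: the dictionary STATES and its approximate areas in square miles are illustrative assumptions, and the two function names are hypothetical. It shows the greater-than ordering applied first to the character strings and then to the numbers that the strings stand for.

```python
# Illustrative data only: approximate total areas in square miles.
STATES = {
    "Alaska": 665_384,
    "California": 163_695,
    "Texas": 268_596,
    "Wyoming": 97_813,
}


def largest_by_name(states):
    """Apply the greater-than ordering to the name strings themselves."""
    return max(states)  # iterating a dict yields its keys, compared as strings


def largest_by_area(states):
    """Apply the greater-than ordering to the numbers the names refer to."""
    return max(states, key=states.get)


print(largest_by_name(STATES))  # last name in alphabetical order
print(largest_by_area(STATES))  # name with the greatest area
```

The two functions differ only in which level of the sign they compare: the string or the quantity associated with the thing the string names.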



3 Logic 

The second branch of semiotics is semantics, or as Peirce called it, logic proper 
- the subject that studies what it means for a pattern of signs to represent a 
true proposition about the things the signs refer to. The first complete system 
of first-order logic (FOL) was the Begriffsschrift by Gottlob Frege (1879), who
developed a notation that no one else, not even his very few students, ever
adopted. The second complete system was the algebraic notation for predicate 
calculus, independently developed by Charles Sanders Peirce (1883, 1885). With 
minor modifications, it became the most commonly used version of logic during 
the twentieth century. It is a much better notation than Frege’s Begriffsschrift, 
but for many people, it is "too mathematical." The third complete system was 
Peirce’s existential graphs of 1897, which he called his chef d’oeuvre - a strong
claim by a man who invented the most widely used version of logic today. 

With existential graphs, Peirce set out to determine the simplest, most primi- 
tive forms for expressing the elements of logic. Although he developed a graphical 
notation for expressing those forms, they can be expressed equally well in a nat- 
ural language, an algebraic notation, or many different linear, graphical, or even 
spoken representations. The following table lists Peirce’s five semantic primi- 
tives, each illustrated with an English example. Since these five elements are 
primitive, they cannot be formally defined in terms of anything more primitive; 
instead, the middle column of the table briefly states their "informal meaning." 



Primitive     Informal Meaning                      English Example

Existence     Something exists.                     There is a dog.
Coreference   Something is the same as something.   The dog is my pet.
Relation      Something is related to something.    The dog has fleas.
Conjunction   A and B.                              The dog is running, and the dog is barking.
Negation      Not A.                                The dog is not sleeping.

Table 1. Five semantic primitives



The five primitives in Table 1 are available in every natural
language and in every version of first-order logic. They are called semantic
primitives because they go beyond syntactic relations between signs to semantic 
relations between signs and the world. Any notation that is capable of expressing 
these five primitives in all possible combinations must include all of FOL as a 
subset. As an example, the WHERE clause of the SQL query language can express 
each of these primitives and combine them in all possible ways; therefore, first- 
order logic is a subset of SQL. Different languages may use different notations 
for representing the five primitives: 



- Existence. In most natural languages, existence is implied by mentioning
  something. For emphasis, languages also provide an explicit existential
  quantifier such as the word some. In the algebraic notation for logic, existence
  may be expressed by an explicit symbol, such as ∃. In SQL, existence is
  stated implicitly by mentioning something or explicitly by using the
  keyword EXISTS.

- Coreference. To say that two different signs refer to the same thing, natural
  languages use a variety of methods, both explicit and implicit: pronouns,
  determiners, inflections, and forms of the verb be. Most linear notations for
  logic use variables and the equal sign, and graphic notations use connecting
  lines or ligatures. Like other linear notations, SQL uses variables and the
  equal sign.

- Relation. Content words in natural languages express some information about
  at least one entity, known as the referent of the word, but they may also
  relate or imply other entities as well. The verb give, for example, refers to an
  act of giving, but it also implies a giver, a gift, and a recipient. In SQL,
  relations are called tables.

- Conjunction. In both natural and artificial languages, conjunction may be
  expressed implicitly by making one statement after another or explicitly by
  a word like and or a symbol like ∧. SQL uses the keyword AND.

- Negation. All natural languages and most versions of logic provide words,
  inflections, or symbols to express negation. The biggest variations from one
  language to another are in the methods for distinguishing the context or
  scope of what is negated from what is not negated. SQL uses the keyword
  NOT with parentheses to show scope.
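
As an illustration of the SQL column of this list, the following sketch uses Python's standard sqlite3 module to run one WHERE clause that combines all five primitives. The tables dog and pet, their columns, and the sample rows are hypothetical and chosen only to echo the English examples of Table 1.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dog (name TEXT, state TEXT);
    CREATE TABLE pet (owner TEXT, name TEXT);
    INSERT INTO dog VALUES ('Rex', 'running'), ('Fido', 'sleeping');
    INSERT INTO pet VALUES ('me', 'Rex');
""")

# "There is a dog that is my pet and that is not sleeping."
query = """
    SELECT d.name
    FROM dog AS d                       -- relation: a table
    WHERE EXISTS (                      -- existence: explicit keyword
              SELECT 1 FROM pet AS p
              WHERE p.name = d.name     -- coreference: variables and =
                AND p.owner = 'me')     -- conjunction: AND
      AND NOT (d.state = 'sleeping')    -- negation: NOT with parentheses
"""
rows = [name for (name,) in conn.execute(query)]
print(rows)
```

Only Rex satisfies every condition in this sample data: Fido is neither listed as a pet nor awake.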

Other logical operators can be defined in terms of these five primitives. Table
2 shows three of the most common: the universal quantifier, implication,
and disjunction. These operators do not qualify as semantic primitives
because they are not as directly observable as the five in Table 1.
Seeing Yojo, for example, is evidence that some cat exists, but there is no way
to perceive every cat. Seeing two things together is evidence of a conjunction,
and not seeing something is evidence of a negation. But there is no direct way
of perceiving an implication or a disjunction: post hoc does not imply propter
hoc, and seeing one alternative of a disjunction does not indicate what other
alternatives are possible. Although the three operators of Table 2
can be defined in terms of the five primitives, any assertion they make about
the world can only be verified indirectly and usually with less certainty than the
basic primitives.



Operator      English Example                          Translation to Primitives

Universal     Every dog is barking.                    not(there is a dog and not(it is barking))
Implication   If there is a dog, then it is barking.   not(there is a dog and not(it is barking))
Disjunction   A dog is barking, or a cat is eating.    not(not(a dog is barking) and not(a cat is eating))

Table 2. Three defined logical operators
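
The translations in Table 2 can be checked mechanically. The sketch below is an illustration under stated assumptions: the function names implies, either, and every are hypothetical, and the universal quantifier is tested only over a finite domain. Each operator is written with nothing but negation, conjunction, and existence (Python's not, and, and any).

```python
from itertools import product


def implies(a, b):
    """'If A then B' rewritten as not(A and not B)."""
    return not (a and not b)


def either(a, b):
    """'A or B' rewritten as not(not A and not B)."""
    return not (not a and not b)


def every(domain, is_dog, is_barking):
    """'Every dog is barking' as not(there is a dog and not(it is barking))."""
    return not any(is_dog(x) and not is_barking(x) for x in domain)


# Compare the defined operators with Python's built-in ones on all truth values.
for a, b in product([False, True], repeat=2):
    assert implies(a, b) == ((not a) or b)
    assert either(a, b) == (a or b)

animals = [("dog", True), ("cat", False)]  # (kind, is it barking?)
print(every(animals, lambda x: x[0] == "dog", lambda x: x[1]))
```

The quiet cat does not falsify the universal statement, because the negated conjunction only looks for a dog that is not barking.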



Instead of choosing existence and conjunction as primitives, Frege chose the
universal and implication as primitives. Then he defined existence and
conjunction in terms of his primitives. The result was not as readable as Peirce’s
algebraic notation, but it was semantically equivalent. Peirce’s existential graphs
(EGs) were also semantically equivalent to both of the other notations, but they
had the simplest of all mappings to the five primitives. SQL also uses existence,
conjunction, and negation as its three basic primitives, but it provides the
keyword OR as well. SQL has no universal quantifier, which must be represented
by a paraphrase of the form NOT EXISTS ... NOT. To add logical operators to
RDF, Berners-Lee (1999) proposed the tags <not> and <exists>, which can
be combined with the implicit conjunction of RDF to define the operators of
Table 2.
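
The NOT EXISTS ... NOT paraphrase can be tried directly. This is a minimal sketch with Python's standard sqlite3 module; the table dog, its barking column, and the helper every_dog_barks are hypothetical names introduced for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dog (name TEXT, barking INTEGER);
    INSERT INTO dog VALUES ('Rex', 1), ('Fido', 1);
""")

# "Every dog is barking" paraphrased as
# "there does not exist a dog that is not barking".
EVERY_DOG_BARKS = """
    SELECT NOT EXISTS (
        SELECT 1 FROM dog AS d
        WHERE NOT (d.barking = 1)
    )
"""


def every_dog_barks(connection):
    """Evaluate the universal statement against the current table contents."""
    (value,) = connection.execute(EVERY_DOG_BARKS).fetchone()
    return bool(value)


before = every_dog_barks(conn)
conn.execute("INSERT INTO dog VALUES ('Spot', 0)")  # one silent dog
after = every_dog_barks(conn)
print(before, after)
```

A single counterexample row is enough to make the inner query non-empty and the outer negation false.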

To illustrate various notations for logic and their relationships to RDF, con- 
sider a typical sentence that might be used in a database specification: Every 
human being has two distinct parents, who are also human beings. Since this 
sentence introduces numbers and plurals, which go beyond the five primitives, 
start with the simpler sentence Some human has a parent, who is also human. 
Figure 4 shows an existential graph that represents the sentence. 

[Figure 4 content: Human, HasParent, Human, joined by two bars]

Fig. 4. EG for Some human has a parent who is human



In an existential graph, the words represent predicates or relations, and the 
bars represent existential quantifiers. The two bars in Figure 4 represent two 
individuals who are human, and the one on the left has the one on the right as a 
parent. In the algebraic notation, each bar would be assigned a variable, such as
x or y, and an existential quantifier, represented by the symbol ∃. As a result,
Figure 4 would map to the following formula:

(∃x)(∃y)(Human(x) ∧ HasParent(x,y) ∧ Human(y)).

The symbol ∧ in the formula represents conjunction, which is implicit in
the EG and RDF notations. Figure 4 could be represented by a triple in RDF:
the first human could be treated as an RDF resource, the HasParent relation as 
an RDF property type, and the second human as an RDF value. The existence 
of the human on the left would be indicated by the proposed RDF quantifier 
<exists var="x">, and the one on the right by <exists var="y">. 

The EG in Figure 5 would require an additional relation or property type
before it could be represented in RDF. It represents the sentence Some human 
has one parent who is human, another parent who is human, and the two are 
not identical. 

[Figure 5 content: one Human linked by two copies of HasParent to two other Human nodes]

Fig. 5. EG for Some human has two distinct human parents

The bar that represents (∃x) in Figure 4 is connected to both copies of the
HasParent relation in Figure 5. Two more bars represent the two human parents. If
they were connected, they would represent the same individual; but to represent
distinct individuals, the connection must be negated. In existential graphs, Peirce
used an oval to indicate negation; in Figure 5, the oval negates part of the
connecting bar. In the algebraic notation, Figure 5 would be represented by the
following formula:

(∃x)(∃y)(∃z)
    (Human(x) ∧ HasParent(x,y) ∧ Human(y)
     ∧ HasParent(x,z) ∧ Human(z) ∧ y≠z).

The inequality y≠z corresponds to the negated connecting bar in Figure 5.
In EGs, the bar that represents a variable also represents coreference, and its
negation represents inequality. Notations that have variables, such as predicate
calculus, SQL, and RDF, must also have a coreference operator, such as = and
its negation ≠. With new property types for Equal and NotEqual, Figure 5 could
be represented in RDF by three existential quantifiers linked together by three 
RDF triples. 
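
To see what the inequality contributes, the formula for Figure 5 can be evaluated over a small finite model. The sketch below is only an illustration: the sets HUMANS and HAS_PARENT and the function name are hypothetical, and the three existential quantifiers are implemented by brute-force search over the domain.

```python
from itertools import product

# Hypothetical finite model: three individuals and two parent links.
HUMANS = {"ann", "bob", "cal"}
HAS_PARENT = {("cal", "ann"), ("cal", "bob")}


def some_human_has_two_distinct_human_parents(humans, has_parent):
    """Search for x, y, z that satisfy every conjunct of the formula."""
    domain = humans | {individual for pair in has_parent for individual in pair}
    return any(
        x in humans
        and (x, y) in has_parent and y in humans
        and (x, z) in has_parent and z in humans
        and y != z  # the negated connecting bar
        for x, y, z in product(domain, repeat=3)
    )


print(some_human_has_two_distinct_human_parents(HUMANS, HAS_PARENT))
print(some_human_has_two_distinct_human_parents(HUMANS, {("cal", "ann")}))
```

In the second call only one parent link exists. Without the conjunct y != z the search would still succeed by binding y and z to the same individual, which is exactly the case that the negated bar rules out.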

The small oval in Figure 5 is sufficient to negate the connection between the 
bar for one parent y and the bar for the other parent z, but an oval can be 
made as large as necessary to show the scope of negation. As Table 2 shows,
expressing a universal quantifier requires two negations, which
are represented by a pair of large ovals in Figure 6. Literally, the resulting graph
may be read It is false that there exists a human being who does not have two
distinct parents. It corresponds to the following formula:

~(∃x)(Human(x) ∧ ~(∃y)(∃z)
    (HasParent(x,y) ∧ Human(y)
     ∧ HasParent(x,z) ∧ Human(z) ∧ y≠z)).

Two copies of the proposed RDF tag <not> and its ending tag </not> 
could be nested to provide the equivalent of the two nested ovals in Figure 6. To 
make RDF equivalent to existential graphs, however, new RDF rules would be 
needed to restrict the scope of the quantified variables to the contexts enclosed 
by the tags <not> and </not>. 

As Table 2 illustrates, a pair of negations can represent either
a universal quantifier or an implication. The EG in Figure 6 may be read in
either way. If the two ovals are considered an implication, Figure 6 could be read
If there exists a human, then that human has a parent who is human and another
parent who is human and the two parents are distinct. Another option is to read
an existential quantifier nested between two ovals as the universal quantifier ∀,
which is expressed by the English word every. Then Figure 6 could be read Every
human has a parent who is human and another parent who is human and the two
parents are distinct. By using the defined operators of Table 2,
the formula could be rewritten in a form that shows the universal quantifier ∀
and the implication ⊃ explicitly:

(∀x)(Human(x) ⊃ (∃y)(∃z)
    (HasParent(x,y) ∧ Human(y)
     ∧ HasParent(x,z) ∧ Human(z) ∧ y≠z)).

In English, this formula may be read For every x, if x is human, then there 
exist a y and a z such that x has the human y as parent, x has the human z as 
parent, and y and z are not the same individual. 
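
The claim that the nested-negation formula and the formula with the universal quantifier and implication say the same thing can be checked exhaustively on small domains. The following sketch makes several assumptions: all function names are hypothetical, a model is represented as a set of humans plus a set of (child, parent) pairs, and agreement on every model over a finite domain is offered as an illustration of the equivalence, not as a proof.

```python
from itertools import chain, combinations, product


def two_distinct_human_parents(x, domain, humans, has_parent):
    """(∃y)(∃z)(HasParent(x,y) ∧ Human(y) ∧ HasParent(x,z) ∧ Human(z) ∧ y≠z)"""
    return any(
        (x, y) in has_parent and y in humans
        and (x, z) in has_parent and z in humans and y != z
        for y in domain for z in domain
    )


def nested_negation_form(domain, humans, has_parent):
    """~(∃x)(Human(x) ∧ ~(...)): the reading with two nested ovals."""
    return not any(
        x in humans and not two_distinct_human_parents(x, domain, humans, has_parent)
        for x in domain
    )


def universal_form(domain, humans, has_parent):
    """(∀x)(Human(x) ⊃ (...)), with ⊃ evaluated as 'not A or B'."""
    return all(
        (x not in humans) or two_distinct_human_parents(x, domain, humans, has_parent)
        for x in domain
    )


def subsets(items):
    """All subsets of a finite collection, as tuples."""
    items = list(items)
    return chain.from_iterable(combinations(items, r) for r in range(len(items) + 1))


def forms_agree_on_all_models(domain):
    """Compare the two formulas on every choice of humans and parent links."""
    pairs = list(product(domain, repeat=2))
    return all(
        nested_negation_form(domain, set(h), set(p))
        == universal_form(domain, set(h), set(p))
        for h in subsets(domain)
        for p in subsets(pairs)
    )


print(forms_agree_on_all_models(["a", "b", "c"]))
```

With three individuals there are 8 possible sets of humans and 512 possible parent relations, so the check covers 4096 models.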

With their minimal number of operators, Peirce’s EGs have a single canon- 
ical form instead of the multiple synonymous sentences in languages with more 
built-in operators, such as English and predicate calculus. That property, which 
is sometimes an advantage, can be a disadvantage when the most natural or 
convenient translation is not obvious. Conceptual graphs (Sowa 1984, 2000) are 
a graphic notation for logic based on existential graphs, but with extended fea- 
tures that support more direct translations to natural languages. Figure 7 shows 
a conceptual graph that corresponds to the existential graph in Figure 6. 

Fig. 7. CG for If there is a human, then he or she has two distinct human parents

The CG in Figure 7 is semantically equivalent to the EG in Figure
6. To indicate the intended reading, the CG has two boxes explicitly labeled If
and Then instead of the EG ovals. EGs use a bar to represent
existential quantification, coreference, and connections between relations; CGs
distinguish those three functions: boxes, called concept nodes, represent
quantification; dotted lines represent coreference; and solid lines represent
connections between the concept nodes and the relation nodes. The node [T] in the
Then context, which is coreferent with the node [Human] in the If context,
corresponds to a pronoun, such as he, she, or it. Altogether, Figure 7 may be read
If there is a human, then he or she has two distinct human parents. To improve
the readability of logic expressed in RDF, Berners-Lee also proposed the tags
<if> and <then> as synonyms for <not>.

Natural languages have a variety of quantifiers, such as the words every, 
some, or all, the numbers two, seventeen, or half, and the phrases more than 
six or at least as many. Those generalized quantifiers can be defined in logic by 
adding Peano’s axioms to define numbers and set theory to define collections, 
but it is convenient to have such quantifiers built into the notation. In CGs, the
default quantifier is the existential ∃, which is normally represented by a blank,
but concept nodes may also contain defined quantifiers, such as the symbol ∀ or
@every to represent the English word every. The CG in Figure 8 is equivalent to
Figure 7 by the definition of the quantifier ∀. It maps to the following formula
in typed predicate calculus:

(∀x:Human)(∃y,z:Human)
    (HasParent(x,y) ∧ HasParent(x,z) ∧ y≠z).

In typed logic, monadic predicates such as Human(x) are replaced by type
labels associated with the variables. The typed formula is more concise, but 
logically equivalent to the untyped formulas that represent the EG of Figure 6. 

Fig. 8. CG for Every human has two distinct human parents

Figure 8 could be represented in RDF with the proposed <forall> quantifier,
but CGs also support other generalized quantifiers that have not yet been
considered for RDF. As an example, Figure 9 simplifies Figure 8 by introducing
the generic plural symbol {*} to represent a set and the number 2 to represent its
count or cardinality. The resulting CG can be mapped to the following formula:

(∀x:Human)(∃s:Set)(∀y∈s)
    (Count(s,2) ∧ HasParent(x,y) ∧ Human(y)).



This formula may be read For every x of type Human, there exists an s of 
type Set such that for every y in s, the count of s is 2, x has y as parent, and 
y is human. The generalized quantifier {*}@2 in the CG maps to two quantified 
variables in predicate calculus: a variable s that ranges over sets and a variable 
y that ranges over the elements of the set s. 
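
The two variables introduced by {*}@2 can be mirrored by two nested loops. The sketch below is an assumption-laden illustration, not a CG implementation: the function name and sample data are hypothetical, and the search enumerates only candidate sets that already have the required count and contain only humans, so Count(s,2) and Human(y) hold by construction. Instead of a single truth value, it returns the humans for whom no such set exists, which is empty exactly when the universal statement holds in the model.

```python
from itertools import combinations


def humans_without_parent_set(humans, has_parent, size=2):
    """Return every human x that has no set s of `size` humans
    such that HasParent(x, y) holds for each y in s."""
    missing = set()
    for x in humans:  # the universally quantified x
        candidates = combinations(sorted(humans), size)  # each s with Count(s, size)
        if not any(all((x, y) in has_parent for y in s) for s in candidates):
            missing.add(x)
    return missing


HUMANS = {"ann", "bob", "cal"}
HAS_PARENT = {("cal", "ann"), ("cal", "bob")}
print(sorted(humans_without_parent_set(HUMANS, HAS_PARENT)))
```

In any finite model without cycles, the individuals at the top of the family tree are necessarily reported, since their parents are not recorded.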




Fig. 9. CG for Every human has a set of two human parents 



The CG in Figure 9 is closer to English, but it still isn’t quite as simple as 
the more natural sentence Every human has two parents. That sentence could 
be expressed directly by the CG in Figure 10. 




Fig. 10. CG for Every human has two parents 



In English, the HasParent relation is normally expressed by the verb have 
combined with the noun parent. That noun belongs to a large class of role words, 
such as spouse, pilot, lawyer, assistant, pet, weed, crop, entrance, obstacle, or 
facility. Syntactically, those words resemble nouns like man, woman, dog, or
tree; but semantically, they imply some relationships to external entities. In
the ontology of the book Knowledge Representation (Sowa 2000), the primitive
relation Has is used to form dyadic relations by combining with concept types
that represent roles. Figure 11 shows how the HasParent relation is defined in 
terms of the relation Has and the role type Parent. 

HasParent ::=




Fig. 11. Definition of the HasParent relation 



Figure 11 defines the HasParent relation as a synonym for a conceptual graph
that has two concepts designated as formal parameters: the symbol λ1 marks
the first parameter as a human who has a parent that is coreferent with another
human, marked as the second parameter by the symbol λ2. It may be read The
HasParent relation is defined as a relation between two humans; the first has a
parent who is the second. With this definition, Figure 10 can be mapped to or
from Figure 9. With appropriate definitions of sets and numbers, Figure 9 can
be mapped to or from Figure 8, which can be mapped to or from the existential
graph or any of the algebraic formulas in typed or untyped predicate calculus.
To support equivalent definitions, RDF would require a tag such as <lambda>
or <parm> to mark a formal parameter.

In addition to the semantic primitives of Table 1, Peirce distinguished
a context-dependent primitive, which he called an indexical. In natural
languages, indexicals are represented by pronouns, by deictic words such as this
and that, and by noun phrases marked by the definite article the. In conceptual
graphs, indexicals are marked by the symbol #. The concept [Cat], for example,
represents some unspecified cat that happens to exist; but the concept
[Cat: #] represents the cat that was most recently mentioned in the current context.
Peirce observed that proper names are also indexicals. Within the context of this 
article, the name Yojo may refer to a cat or to Queequeg’s ebony idol. On the 
Internet, it also refers to some Japanese people, to others who have adopted that 
word as a nickname, and to an organization of young journalists. The ambiguity 
of names and their context dependencies are major concerns addressed by the 
naming conventions of the Internet. Those conventions are semiotic features that 
can be represented by metalevel types and relations in conceptual graphs and 
other logic-based notations. 

In summary, the algebraic notation for logic, which is popular with
mathematicians, is only one of an open-ended number of semantically equivalent
notations. The five semantic primitives of Table 1 and the mechanisms
for defining the other operators of first-order logic can be adapted to a wide
variety of notations, including natural languages and the web-oriented notations
of XML and RDF.

1. Logic can be and has been represented in a wide variety of graphic and linear
notations of varying degrees of readability and suitability for different kinds 
of applications. EGs and CGs are graphic examples, and the Knowledge 
Interchange Format (KIF) is an equivalent linear form. Other linear versions 
can be written with the syntactic conventions of SQL, RDF, or even natural 
languages. 

2. For better readability, it is possible to represent the logical operators in con- 
trolled natural languages, which use a subset of the syntax and vocabulary 
of natural languages. Although the task of translating unrestricted natural 
languages to any formal notation is still a research problem, it is much eas- 
ier to translate conceptual graphs and other formal notations to a stylized 
version of natural language, such as the English readings given for Figures 
4 through 11. 

3. Besides notation, logic has rules of definition and inference, which allow one 
representation to be translated to or from other synonymous representa- 
tions. Figures 6 through 10 can be translated automatically to or from one 
another or the equivalent formulas in predicate calculus - provided that an 
appropriate ontology has been defined. With its formally defined semantics, 
logic provides the means for generating semantically equivalent translations 
to and from other languages with radically different syntax. 

For better readability, any of the logical notations mentioned in this section 
can be translated to controlled natural languages. One important application, 
for example, is the generation of comments and help messages automatically 
from the implementation. Such translations would guarantee that the comments 
and help would always be up to date, consistent with the implementation, and 
immediately available in every supported national language. 
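
As a rough indication of how such a translation might work, the sketch below renders a formula as stylized English by recursion over its structure. Everything in it is an assumption made for illustration: the nested-tuple encoding of formulas, the operator tags, and the fixed English phrases are hypothetical and are not taken from any of the systems discussed in this article.

```python
def to_english(formula):
    """Render a formula, encoded as nested tuples, in stylized English."""
    op = formula[0]
    if op == "atom":      # ("atom", "x is human")
        return formula[1]
    if op == "not":       # ("not", body)
        return f"it is false that {to_english(formula[1])}"
    if op == "and":       # ("and", f1, f2, ...)
        return " and ".join(to_english(part) for part in formula[1:])
    if op == "exists":    # ("exists", "y", body)
        return f"there exists some {formula[1]} such that {to_english(formula[2])}"
    if op == "forall":    # ("forall", "x", body)
        return f"for every {formula[1]}, {to_english(formula[2])}"
    if op == "if":        # ("if", antecedent, consequent)
        return f"if {to_english(formula[1])}, then {to_english(formula[2])}"
    raise ValueError(f"unknown operator: {op!r}")


every_human_has_a_parent = (
    "forall", "x",
    ("if", ("atom", "x is human"),
     ("exists", "y",
      ("and", ("atom", "y is human"), ("atom", "x has y as parent")))),
)
print(to_english(every_human_has_a_parent))
```

Because the English text is computed from the formula, it cannot drift out of step with it; supporting another national language would amount to swapping the phrase templates.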



4 Combining Logic with Ontology 

Pure logic is ontologically neutral. It makes no presuppositions about what exists 
or may exist in any domain or any language for talking about the domain. To 
represent knowledge about a specific domain, it must be supplemented with an 
ontology that defines the categories of things in that domain and the terms that 
people use to talk about them. The ontology defines the words of a natural 
language, the predicates of predicate calculus, the concept and relation types of 
conceptual graphs, the classes of an object-oriented language, or the tables and 
fields of a relational database. To illustrate the issues of defining an ontology, 
consider the conceptual graph in Figure 12, which represents the sentence Yojo 
is chasing a mouse. 

Fig. 12. CG for Yojo is chasing a mouse

Figure 12 uses three concepts and two conceptual relations. The concept
[Cat: Yojo] represents a cat named Yojo; [Chase] represents an instance of
chasing; and [Mouse] represents a mouse. The conceptual relation (Agnt) indicates
that Yojo is the agent of chasing, and (Thme) indicates that the mouse is the
theme or the one that is being chased. The CG is logically equivalent to the
following formula in typed predicate calculus:

(∃x:Cat)(∃y:Chase)(∃z:Mouse)
    (name(x,"Yojo") ∧ agnt(y,x) ∧ thme(y,z)).

This formula and the CG in Figure 12 introduce several ontological assump- 
tions: there exist entities of types Cat, Chase, and Mouse; some entities have 
character strings as names; and Chase can be linked to concepts of other entities 
by relations of type Agent and Theme. 

The representation of actions by distinct concepts follows Peirce’s ontology, 
which represents an action such as chasing with three distinct entities: the one 
that is chasing, the one that is being chased, and the act of chasing itself. The 
relations (Agnt) and (Thme) are examples of the case relations or thematic roles 
of linguistics. Instead of Peirce’s ontology, which is also called event semantics, 
some logicians would represent the verb is chasing by a single predicate, such as 
chases: 

(∃x:Cat)(∃y:Mouse)
    (name(x,"Yojo") ∧ chases(x,y)).

The ontology of this formula could also be used in a conceptual graph:

[Cat: Yojo]→(Chases)→[Mouse].

This CG, which is written in the linear notation for CGs, can be translated
to Figure 12 by defining the relation (Chases) in terms of the concept [Chase]:

Chases ::=
[Animate: λ1]←(Agnt)←[Chase]→(Thme)→[MobileEntity: λ2].

With this definition of (Chases), the ontology of the previous CG can be 
translated to or from the ontology assumed in Figure 12. 
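
The translation in both directions can be sketched as a pair of rewriting functions. The representation is an assumption made for this illustration: facts are (relation, first, second) triples, events receive generated identifiers such as chase1, and the function names are hypothetical. Type constraints on the parameters (Animate, MobileEntity) are not checked.

```python
from itertools import count


def expand_chases(facts):
    """Rewrite each ('Chases', x, y) as a Chase event with Agnt and Thme links."""
    fresh = count(1)
    expanded = []
    for relation, first, second in facts:
        if relation == "Chases":
            event = f"chase{next(fresh)}"
            expanded += [
                ("type", event, "Chase"),
                ("Agnt", event, first),
                ("Thme", event, second),
            ]
        else:
            expanded.append((relation, first, second))
    return expanded


def contract_chases(facts):
    """Rewrite each Chase event that has both an agent and a theme as ('Chases', x, y)."""
    events = sorted(e for rel, e, t in facts if rel == "type" and t == "Chase")
    agent = {e: a for rel, e, a in facts if rel == "Agnt"}
    theme = {e: m for rel, e, m in facts if rel == "Thme"}
    complete = [e for e in events if e in agent and e in theme]
    kept = [fact for fact in facts if fact[1] not in complete]
    return kept + [("Chases", agent[e], theme[e]) for e in complete]


dyadic = [("name", "yojo", "Yojo"), ("Chases", "yojo", "mouse1")]
event_based = expand_chases(dyadic)
print(event_based)
print(contract_chases(event_based))
```

The contraction discards any other fact attached to the event identifier, which is one way of seeing why the dyadic form has nowhere to hang additional modifiers.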

Although the Chases relation allows shorter graphs and formulas than the 
concept [Chase], it introduces other complexities into the ontology. A general 
representation for tenses and modality, for example, would require a proliferation 
of relation types, such as HasChased, WillChase, and MustHaveBeenChasing. 
Furthermore, the dyadic relation chases(x,y) makes no provision for attaching
adverbs and other modifiers to the verb. Figure 13 takes advantage of the more
general representation to define the concept type Chase in terms of a graph for
an animate agent (parameter λ1) that is following a mobile entity (parameter
λ2) in a rapid manner.

Chase ::=

Fig. 13. Definition of the concept type Chase

Figure 13 is only a partial definition because it represents a necessary, but 
not a sufficient condition. Runners in a race, for example, might be following 
one another rapidly, but only because they are pursuing a common goal. A 
more complete definition must include the purpose, which might be different for 
different senses of the word chase. Figure 14 defines one sense, called ChaseHunt, 
in which the purpose of the agent is to catch the mobile entity that is being 
chased. 



ChaseHunt ::= 




Fig. 14. Definition of the concept type ChaseHunt 



In Figure 14, the purpose relation (Purp) links the action to a situation that 
would occur upon the successful completion of the chase. According to Peirce, 
purpose is a triadic relation, of which two arguments are shown explicitly in 
Figure 14. The implicit third argument is the agent of Chase, whose intention is 
to bring about the desired situation. That situation is nested inside a context box 
because its intentional status is different from the context of the act of chasing. 
If the chase is unsuccessful, the act of catching might never occur. Figure 15 
defines another concept type ChaseAway, in which the agent’s purpose is not to 
catch the mobile entity, but to cause it to leave its current location. 
ChaseAway ::=




Fig. 15. Definition of the concept type ChaseAway 



The ontology of situations and their representation in contexts is based on 
Peirce’s logic combined with ideas developed in artificial intelligence, linguistics, 
philosophy, and logic over the past 40 years (Sowa 2000). A context box may 
enclose modal or intentional situations, as in Figures 14 and 15, or it may enclose 
temporally or spatially separated parts of a larger situation. In Figure 16, the 
large situation with its sequence of nested situations represents the following 
passage in English: 

At 10:17 UTC, there was a situation involving a cat named Yojo and 
a mouse. Yojo chased the mouse. Then he caught the mouse. Then he 
ate the head of the mouse. 

These sentences show how indexicals are used to make context-dependent 
references. When new entities are first mentioned, they are introduced with the 
indefinite article, as in the phrases a situation, a cat named Yojo, and a mouse. 
The two middle sentences refer to the mouse with the definite article the and 
to the cat with the name Yojo or the pronoun he. In the last sentence, the 
head of the mouse, which had not been mentioned explicitly, is marked with the 
definite article because the introduction of the mouse implicitly introduces all of 
its expected parts. In Figure 16, the indexicals are marked with the # symbol:
the pronoun he is represented as #he, and the definite article the is represented
with the symbol # by itself.

The large context box of Figure 16 encloses the entire situation, which
occurred at the point in time (PTim) of 10:17 UTC. It contains concept nodes
that represent the cat Yojo, the mouse, and three nested situations connected
by the (Next) relation. Before that CG can be translated to predicate calculus,
the indexicals must be resolved to links or labels that explicitly show the
coreferences. To avoid multiple line crossings, Figure 17 introduces the coreference
labels *x for Yojo and *y for the mouse. Subsequent references use the same
labels, but with the prefix ? in the bound occurrences of [?x] for Yojo and [?y]
for the mouse. The symbol # in the concept [Head: #] of Figure 16 is erased in
Figure 17, since the head of a normal mouse is uniquely determined when the
mouse itself is identified.
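
A very simple resolution strategy can be sketched as follows. It is a deliberately naive illustration and not the method used to produce Figure 17: the input format, the function name, and the generated labels x1, x2, ... are hypothetical, and the only rule applied is that [Type: #] refers to the most recently mentioned entity of the same type. Pronouns such as #he and implicitly introduced parts such as the head of the mouse are outside its scope.

```python
def resolve_indexicals(mentions):
    """Replace each (type, '#') with a bound label for the latest mention of that type.

    mentions: (type_label, referent) pairs in order of mention, where the
    referent is a name, None for an unnamed new entity, or '#'.
    Returns (type_label, referent, label) triples, where '*xN' marks a
    defining occurrence and '?xN' a bound occurrence.
    """
    latest = {}   # type label -> (referent, number) of the latest defining mention
    counter = 0
    resolved = []
    for type_label, referent in mentions:
        if referent == "#":
            if type_label not in latest:
                raise ValueError(f"no antecedent for [{type_label}: #]")
            name, number = latest[type_label]
            resolved.append((type_label, name, f"?x{number}"))
        else:
            counter += 1
            latest[type_label] = (referent, counter)
            resolved.append((type_label, referent, f"*x{counter}"))
    return resolved


story = [("Cat", "Yojo"), ("Mouse", None), ("Mouse", "#"), ("Cat", "#")]
for node in resolve_indexicals(story):
    print(node)
```

An indexical with no earlier mention of its type raises an error, reflecting the requirement that every indexical be resolved before translation to predicate calculus.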
Fig. 16. Nested situations with unresolved indexicals



After the indexicals have been resolved to coreference labels, Figure 17 can 
be translated to the following formula in typed predicate calculus: 

(∃s1:Situation)(pTim(s1,"10:17 UTC")
  ∧ dscr(s1,
      (∃s2,s3,s4:Situation)(∃x:Cat)(∃y:Mouse)(name(x,"Yojo")
        ∧ dscr(s2, (∃u:Chase)(agnt(u,x) ∧ thme(u,y)))
        ∧ dscr(s3, (∃v:Catch)(agnt(v,x) ∧ thme(v,y)))
        ∧ dscr(s4,
            (∃w:Eat)(∃z:Head)(agnt(w,x) ∧ ptnt(w,z) ∧ part(y,z)))
        ∧ next(s2,s3) ∧ next(s3,s4)))).

The description predicate dscr(s,p), which corresponds to the context boxes
of Figure 16, is a metalevel relation between a situation s and a proposition p
that describes s. Figure 17 or its translation to predicate calculus could also
be paraphrased in a version of controlled English that uses variables to show
coreference explicitly: At 10:17 UTC, there was a situation s involving a cat x
named Yojo and a mouse y. In the situation s, x chased y; then x caught y; then
x ate the head of y.
Fig. 17. Nested situations with indexicals resolved



The contexts of conceptual graphs are based on Peirce’s logic of existential 
graphs and his theory of indexicals. Yet the CG contexts happen to be isomor- 
phic to the similarly nested discourse representation structures (DRS), which 
Hans Kamp (1981a, b) developed for representing and resolving indexicals in 
natural languages. When Kamp published his first version of DRS, he was not 
aware of Peirce’s graphs. When Sowa (1984) published his book on conceptual 
graphs, he was not aware of Kamp’s work. Yet the independently developed 
theories converged on semantically equivalent representations; therefore, Sowa 
and Way (1986) were able to apply Kamp’s techniques to conceptual graphs. 
Such convergence is common in science; Peirce and Frege, for example, started 
from very different assumptions and converged on equivalent semantics for FOL, 
which 120 years later is still the most widely used version of logic. Independently 
developed, but convergent theories that stand the test of time are a more reliable 
basis for standards than the consensus of a committee. 

5 Extracting Logic from Language 

Since all combinations of the five primitives of Table 1 can be expressed in every 
natural language, it is possible to represent first-order logic in a subset of any 
natural language. Such a subset, called a stylized or controlled NL, can be read by
anyone who can read the unrestricted NL. As examples, the English paraphrases 
of the CGs and formulas in this article represent a version of controlled English. 
With an appropriate selection of syntax rules, that subset could be formalized 
as a representation of FOL that would be semantically equivalent to any of the 
common notations for logic. 

For most people, no training is needed to read a controlled NL, but some 
training is needed to write it. For computers, it is easy to translate a controlled 
NL to or from logic, but fully automated understanding of unrestricted NL is 
still an unsolved research problem. To provide semiautomated tools for analyzing 
unrestricted language, Doug Skuce (1995, 1998, 2000) has designed an evolving 
series of knowledge extraction (KE) systems, which he called CODE, IKARUS, 
and DocKMan (Document-based Knowledge Management). The KE tools use a 
version of controlled English called ClearTalk, which is intelligible to both people 
and computers. As input, the KE tools take documents written in unrestricted 
NL, but they require assistance from a human editor to generate ClearTalk as 
output. Once the ClearTalk has been edited and approved, further processing by 
the KE tools is fully automated. The ClearTalk statements can either be stored in 
a knowledge base or be written as annotations to the original documents. Because 
of the way they’re generated, the comments that people read are guaranteed to 
be logically equivalent to the computer implementation. 

The oldest logic patterns expressed in controlled natural language are the four 
types of statements used in Aristotle’s system of syllogisms. Each syllogistic rule 
combines a major premise and a minor premise to draw a conclusion. Following 
are examples of the four sentence patterns: 



1. Universal affirmative. Every employee is a person. 

2. Particular affirmative. Some employees are customers. 

3. Universal negative. No employee is a competitor. 

4. Particular negative. Some customers are not employees. 

These patterns and the syllogisms based on them are used in many controlled 
language systems, including ClearTalk. For inheritance in expert systems and 
object-oriented systems, the major premise is a universal affirmative statement 
with the verb is, and the minor premise is either a universal affirmative or a 
particular affirmative statement with is, has, or other verbs. For database and 
knowledge base constraints, the major premise is a universal negative statement 
that prohibits undesirable conjunctions, such as employee and competitor. 
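
As a rough illustration of how the four sentence patterns correspond to first-order formulas, and of how inheritance follows from two universal affirmatives, consider the sketch below. The function names and the string templates are hypothetical; they are not taken from ClearTalk or any other system mentioned here.

```python
# Hypothetical mapping from the four Aristotelian sentence patterns to
# first-order formulas, written as plain strings.
PATTERNS = {
    "universal affirmative": "(∀x)({s}(x) ⊃ {p}(x))",
    "particular affirmative": "(∃x)({s}(x) ∧ {p}(x))",
    "universal negative": "~(∃x)({s}(x) ∧ {p}(x))",
    "particular negative": "(∃x)({s}(x) ∧ ~{p}(x))",
}


def to_fol(pattern, subject, predicate):
    """Return the formula for one sentence pattern."""
    return PATTERNS[pattern].format(s=subject, p=predicate)


def barbara(major, minor):
    """Syllogism for inheritance: from 'Every M is P' and 'Every S is M',
    conclude 'Every S is P'. Each premise is a (subject, predicate) pair."""
    middle, predicate = major
    subject, minor_predicate = minor
    if minor_predicate != middle:
        return None  # the premises do not share a middle term
    return (subject, predicate)
```

For example, to_fol("universal negative", "employee", "competitor") gives the constraint that prohibits the conjunction of employee and competitor, and barbara(("employee", "person"), ("manager", "employee")) yields the inherited statement that every manager is a person.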

Other important logic patterns are the if-then rules used in expert systems. 
In some rule-based systems, the controlled language is about as English-like as 
COBOL, but others are much more natural. Attempto Controlled English (Fuchs 
et al. 1998; Schwitter 1998) is an example of a rich, but unambiguous language 
that uses a version of Kamp’s theory for resolving indexicals. Following are two 
ACE rules used to specify operating procedures for a library database: 







If a copy of a book is checked out to a borrower 
and a staff member returns the copy 
then the copy is available. 

If a staff member adds a copy of a book to the library 
and no catalog entry of the book exists 
then the staff member creates a catalog entry 

that contains the author name of the book 
and the title of the book 
and the subject area of the book 
and the staff member enters the id of the copy 
and the copy is available. 

Rules like these are translated automatically to the Horn-clause subset of 
FOL, which is the basis for Prolog and many expert system languages. The 
subset of FOL consisting of Horn-clause rules plus Aristotelian syllogisms can 
be executed efficiently, but it is powerful enough to specify a Turing machine. 
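
To suggest how such a rule behaves once it is in Horn-clause form, the following sketch applies naive forward chaining to a ground version of the first library rule. It is a simplification under stated assumptions: the atoms are fixed strings such as available(copy1) rather than clauses with variables, and the names copy1, borrower1, and staff1 are invented for the example. It does not show how ACE itself performs the translation.

```python
def forward_chain(facts, rules):
    """Naive forward chaining over ground Horn clauses.

    Each rule is a pair (body, head): if every atom in body has been
    derived, then head is derived as well.
    """
    derived = set(facts)
    changed = True
    while changed:
        changed = False
        for body, head in rules:
            if head not in derived and all(atom in derived for atom in body):
                derived.add(head)
                changed = True
    return derived


# Ground instance of: "If a copy of a book is checked out to a borrower
# and a staff member returns the copy then the copy is available."
rules = [
    (("checked_out(copy1, borrower1)", "returns(staff1, copy1)"),
     "available(copy1)"),
]
facts = {"checked_out(copy1, borrower1)", "returns(staff1, copy1)"}
```

With both conditions present, forward_chain(facts, rules) contains available(copy1); with either condition missing, the conclusion is not derived.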

For database queries and constraints, natural language statements with the 
full expressive power of FOL can be translated to SQL. Although many NL 
query systems have been developed, none of them have yet become commer- 
cially successful. The major stumbling block is the amount of effort required to 
define the vocabulary terms and map them to appropriate fields of the database. 
But if KE tools are used to design the database, the vocabulary needed for the 
query system can be generated as a by-product of the design process. As an 
example, the RECIT system (Rassinoux 1994; Rassinoux et al. 1998) uses KE 
tools to extract knowledge from medical documents written in English, French, 
or German and translates the results to a language-independent representation 
in conceptual graphs. The knowledge extraction process defines the appropri- 
ate vocabulary, specifies the database design, and adds new information to the 
database. The vocabulary generated by the KE process is sufficient for end users 
to ask questions and get answers in any of the three languages. 
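
As a small illustration of a constraint stated in controlled English and checked through SQL, the sketch below treats the universal negative "No employee is a competitor" as a query that lists violations. The table and column names are assumptions made for the example; the mapping from vocabulary terms to database fields is exactly the part that, as noted above, takes effort to define in a real system.

```python
import sqlite3


def constraint_violations(conn):
    """Return names that violate "No employee is a competitor",
    i.e. witnesses of (∃x)(employee(x) ∧ competitor(x))."""
    query = """
        SELECT e.name
        FROM employee AS e
        WHERE EXISTS (SELECT 1 FROM competitor AS c WHERE c.name = e.name)
        ORDER BY e.name
    """
    return [row[0] for row in conn.execute(query)]


def demo():
    """Build a tiny in-memory database (hypothetical schema) and check it."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE employee (name TEXT PRIMARY KEY)")
    conn.execute("CREATE TABLE competitor (name TEXT PRIMARY KEY)")
    conn.executemany("INSERT INTO employee VALUES (?)", [("Ann",), ("Bob",)])
    conn.executemany("INSERT INTO competitor VALUES (?)", [("Bob",), ("Eve",)])
    return constraint_violations(conn)
```

The constraint holds exactly when the returned list is empty; in the demo data, Bob appears in both tables and is reported as a violation.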

Design and specification languages have multiple metalevels. As an example, 
the Unified Modeling Language has four levels: the metametalanguage defines 
the syntax and semantics of the UML notations; the metalanguage defines the 
general-purpose UML types; a systems analyst defines application types as in- 
stances of the UML types; finally, the working data of an application program 
consists of instances of the application types. To provide a unified view of all 
these levels, Olivier Gerbé and his colleagues at the DMR Consulting Group
implemented design tools that use conceptual graphs as the representation language
at every level (Gerbé et al. 1995, 1996, 1997, 1998, 2000). For his PhD
dissertation, Gerbé developed an ontology for using CGs as the metametalanguage
for defining CGs themselves. He also applied it to other notations, including
UML and the CommonKADS system for designing expert systems. Using that
theory, Gerbé and his colleagues developed the Method Repository System as
an authoring environment for editing, storing, and displaying the methods used 
by the DMR consultants. Internally, the knowledge base is stored in concep- 
tual graphs, but externally, the graphs can be translated to web pages in either 
English or French. About 200 business processes have been modeled in a total
of 80,000 CGs. Since DMR is a Canadian company, the language-independent 
nature of CGs is important because it allows the specifications to be stored in 
the neutral CG form. Then any manager, systems analyst, or programmer can 
read them in his or her native language. 
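
The idea of several metalevels, each defining the types used at the level below, can be loosely illustrated with Python metaclasses. This is only an analogy under stated assumptions: the names UMLClass and Employee are invented, Python's built-in type stands in for the metametalanguage, and nothing here reflects how the DMR tools or UML are actually implemented.

```python
class UMLClass(type):
    """Metalanguage level: a general-purpose, UML-like type (hypothetical)."""

    def describe(cls):
        return f"{cls.__name__} is an instance of {type(cls).__name__}"


class Employee(metaclass=UMLClass):
    """Application type, as a systems analyst might define it."""

    def __init__(self, name):
        self.name = name


# Working data of an application: an instance of the application type.
ann = Employee("Ann")


def metalevel_chain(obj):
    """List the type names above obj, ending at the built-in 'type'."""
    chain = []
    current = type(obj)
    while True:
        chain.append(current.__name__)
        if current is type:
            break
        current = type(current)
    return chain
```

Here metalevel_chain(ann) walks from the working data up through the application type and the UML-like type to the top level. Unlike the four UML levels, Python's type is its own type, so the analogy stops at that point.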

Translating an informal diagram to a formal notation of any kind is as difficult 
as translating unrestricted NL to executable programs. But it is much easier to 
translate a formal representation in any version of logic to controlled natural 
languages, to various kinds of graphics, and to executable specifications. Walling
Cyre and his students have developed KE tools for mapping both the text and the
diagrams from patent applications and similar documents to conceptual graphs
(Cyre et al. 1994, 1997, 1999). Then they implemented a scripting language
for translating the CGs to circuit diagrams, block diagrams, and other graphic 
depictions. Their tools can also translate CGs to VHDL, a hardware design 
language used to specify very high speed integrated circuits (VHSIC). 

No single system discussed in this paper incorporates all the features desired 
in a KE system, but the critical research has been done, and the remaining work 
requires more development effort than pure research. Figure 18 shows the flow 
of information from documents to logic and then to documents or to various 
computational representations. The dotted arrow from documents to controlled 
languages requires human assistance. The solid arrows represent fully automated 
translations that have been implemented in one or more systems. 



[Figure 18 (diagram not reproduced); surviving node labels: Documents, Controlled Languages, XML.]

Fig. 18. Flow of information from documents to computer representations



For the KE tools, the unifying representation language is logic, which may 
be implemented in different subsets and notations for different tools. All the 
subsets, however, use the same vocabulary of natural-language terms, which map 
to the same ontology of concepts and relations. From the user’s point of view, 
a KE system communicates in a subset of natural language, and the differences 
between tools appear to be task-related differences rather than differences in 
language. 

Abstract. Here is my attempt to find the roots of "the ontological problem" in
AI research identified by Daniel Kayser at ICCS'98. I reconstruct just enough 
of C.S. Peirce's "scientific philosophy" to suggest how pragmatism responds to 
fundamental (metaphysical) issues in Knowledge Representation, and to 
indicate how Kayser's notion of "variable ontology" for "conceptual 
adaptation" might be interpreted as pragmatic ontology, a model methodology 
for Conceptual Structures research and development. At least, here may be an 
introduction to further investigations? 



1 Introduction 

I take my starting point from Daniel Kayser's plenary paper for ICCS in 1998, under 
the cordial title "Ontologically Yours." He considers the distinction between the 
traditional use of the term "ontology" in philosophy and the current use of it by AI 
researchers in building conceptual structures, finding nothing wrong with that "so 
long as what is meant remains clear," which his investigation indicates is not the case. 
Furthermore, he says both the philosophical and the AI meanings of the term are 
inadequate for appreciating the challenges faced in conceptual structures 
development. He specifically argues that AI's "quest for a unique ontology" of 
anything more than perfectly defined mathematical notions is a wrongly conceived 
objective (see his reference to Mugnier's example, 1: 47). And he concludes that 
"ontological multiplicity" implies the need for "conceptual adaptation," which is 
essential if we want tools truly adapted to real life domains with a high standard of 
rigor, not to their idealization [1: 47]. 

Kayser maintains that the question of how Knowledge Representation (KR) 
researchers choose their entities of concern (whether these are concepts, relations, 
individuals, or whatever) is indeed an ontological question. He indicates why it has 
not been considered so, and says traditional philosophy offers no help. His leading 
question asks: "When AI considers some questions as being ontological, does it really 
mean that we are concerned with metaphysical problems?" [1: 35] 

AI is mainly concerned with the adequacy of its models for a purpose, and that 
adequacy, Kayser asserts, "is not grounded on being 'real'; [and furthermore] being 
realistic, i.e. aiming at the representation of 'all' aspects of reality, goes in the opposite 
direction to being adequate, for obvious complexity reasons" [1: 35]. He sums up the
difference: AI wants to find the basis for efficient models; philosophy wants to
discuss the problem of real existence. "Their goals are not compatible, and their
methods are completely contradictory" [1: 35-36]. 

I will specifically respond to Kayser's arguments and suggestions, after outlining 
what C.S. Peirce's "scientific philosophy" might offer in analyses of existence, reality, 
and representation. At the end of the nineteenth century, Peirce confronted 
philosophical circumstances that are historically significant in explaining what Kayser 
distinguishes as the AI and the philosophical goals. Peirce took issue with both goals, 
in terms of the nominalist understanding of representation (which replaced 
metaphysical questions of existence in philosophy with what amounts to a contest of 
models) and "seminary metaphysics" (which did not effectively distinguish reality
from existence). Kayser's paper introduces the contemporary context of ontological 
issues for which Peirce urged construction of a "scientific philosophy" and claimed 
that pragmatism was the method needed to develop metaphysics as an essential part 
of it. 



2 Modern Philosophy and Metaphysical Horror 

Helpfully, Kayser begins examination of the ontological difficulty in terms of how 
"the layman" understands what "to exist" means, which he refers to as "the most 
'natural' way of existing." He says, for example, "we often forget that 'tree' is a word 
that people use to refer to a category of objects, and that the existence of trees is 
nothing more than an agreement that . . . [specific objects] share common properties 
that we consider as important" [1: 36]. He then distinguishes that "natural existence" 
from "existence inside a model" in the mathematical terms of tautologies. He admits 
that the distinction between these two is ill-defined, because the first one "refers to an 
agreement about the 'obvious model' of Nature, hence existence in Nature is also 
existence inside of a model" [1: 36]. Then he raises what he calls the "deeper 
problem" of whether the existence of the tree is in Nature or only in the "obvious 
model" we construct for some convenient purpose. As the paper proceeds, he shows 
that this metaphysical puzzle, which seems to be purely philosophical, is 
fundamentally crucial in the pursuits of AI. 

In Peirce's time (mid-to-late nineteenth century), European academic philosophy 
had framed that problem in terms of Realism vs Idealism. However, by that time, 
Kant's Critique of Pure Reason was beginning to undermine German Idealist 
metaphysics, and the British Nominalists had all but eliminated Realist (Scholastic) 
metaphysics from the domain of philosophical pursuits and relegated it to theology. 
Very few philosophers critically examined these dominant trends (as Peirce did), 
which eventually changed ontological questions into epistemological questions. A
new academic dichotomy, Empiricism vs Rationalism, converted the fundamental
philosophical question from "What exists?" to "How can we know anything?" 
Modern philosophy presumed that we can never ask meaningful questions about 
existence itself, and can only know through constructing models or theories of reality
(that is, of what appears to be true through observation or rationalization). The
primary philosophical controversy by the turn of the century might be expressed this
way: Can we build the best models of reality by rational or by empirical methods; or,
which sort of methods should be foundational?

Continuing the nineteenth-century trend, influential philosophers early in the
twentieth century ignored any notions from traditional metaphysics (as the search for 
certainty and for the "ultimate ground of knowledge"), and were encouraged to do so 
by the rising success of science. "Metaphysical horror," as Kolakowski refers to it, 
began without anyone saying it in so many words: "the very notions of existence and 
reality (unless they are empirically definable, as in the distinction between 
hallucinations and normal perceptions) became useless and misleading" [2: 21-22]. 
These "radical empiricists" were concerned with problems of valid (or consistent) 
model-building. They conveniently assumed the nominalist position on metaphysical 
questions: If what exists is a moot point, we need only a consistent model (or 
representation of knowledge) that conforms to whatever is "objectively experienced," 
as the sciences can observe and interpret it. In mainstream philosophy, epistemology 
subsumed or ignored any concerns that had been considered ontological, or 
metaphysical. In other words, philosophy assumed metaphysics to be irrelevant to the 
modern concerns of logic, mathematics, and science. 

Meanwhile, Peirce (a logician, mathematician, and scientist) insisted, instead, that 
Kant's philosophy had put the final "lock on the door of philosophy" by claiming that
only deductive reasoning can possibly be valid. He says, "Thus, we seem to be 
driven to this point. On the one hand, no determination of things, no fact, can result 
in the validity of probable argument; nor, on the other hand, is such argument 
reducible to that form which holds good, however the facts may be. This seems very 
much like a reduction to absurdity of the validity of such reasoning; and a paradox of 
the greatest difficulty is presented for solution" [CP: 5.347]. Kant's conclusion had 
implied to his nineteenth-century followers that logic (in its traditional state of 
development) was actually of no use to the dynamic and complex world of scientific 
discovery (or to all probable inference, whether induction or hypothesis; that is, 
inference from the parts to the whole or statistical inference). In response to this 
circumstance, as Tursman explains, "it is no accident that Peirce's system of logic was 
designed to be a theory of scientific discovery . . . [he] regarded every step in science 
as a lesson in logic and after many long and difficult lessons, the observant logician 
learns what logic must be in order to be powerful enough to encompass the domain of 
scientific discovery" [3: xi]. 

From a late twentieth century perspective, Bernstein broadly summarizes Peirce's 
confrontation with modern philosophy and its misunderstanding of scientific inquiry. 

Retrospectively, we can see how Peirce's philosophic inquiries span issues that 
have become central to both Anglo-American and Continental philosophy. As a 
practicing experimental scientist and one of the most creative logicians of his time, 
Peirce forcefully argued that the dominant conception of the nature of science that 
had been shaped by the empiricist and rationalist tradition was grossly inadequate. 
It distorted the true experimental spirit of scientific inquiry. ... He elaborated a 
new image of scientific inquiry, and how it can be understood without indubitable 
foundations. He sought ... a proper understanding of the intersubjective and 
communal dimensions of scientific inquiry. His "image of science"— and indeed all 
inquiry— focused on inquiry as a continuous self-corrective activity governed by 
the norms of a critical community of inquirers. . . . But Peirce was not just a 
philosopher of science and a logician. If we are to gain a genuine understanding of 
scientific inquiry, then it is necessary to probe metaphysical and cosmological 
issues. Meaning and truth in science must be examined from a more general
theory of language and signification. [4: xx-xxi] 

Karl-Otto Apel summarizes the implications of Peirce's philosophical view that 
knowledge brings about mediation for a consistent opinion about the real, or the 
'representation' of our outer (hypothetical) states of affairs: 

The characteristic innovation of Peirce's logic of inquiry cannot be regarded as a 
return to metaphysical Realism or Idealism, but rather as a meaning-critical 
postulate in the framework of a semiotic transformation of Kant's "transcendental 
logic." This occurs when Peirce replaces [Kant's] concept of "incognizable things 
in themselves" with the concept of the "infinitely cognizable" and [Kant's] concept 
of the "transcendental subject" of cognition with the concept of the "indefinite 
community" as the subject of the "ultimate opinion." [5: 21-22, ix] 

Metaphysical inquiry has certainly not improved since 1867, when Peirce wrote: 
"In fact the prevalent view of the present day is a heterogeneous hodgepodge of the 
most contradictory theories; its doctrines are borrowed from different philosophers 
while the premisses by which alone those philosophers were able to support their 
doctrines are denied; the theory thus finds itself totally unsupported by the facts and 
in several particulars at war with itself" [CP: 7.580].

One of Peirce's primary objectives was to reinstate existence as an essential 
philosophical (even metaphysical) concept that could help make tractable the many 
traditional puzzles and seemingly irreconcilable oppositions of philosophical 
controversy, with the hope of arresting the endless nominalist debates about the 
superior validity of one or another model. All of his philosophical work might be 
understood as part of this objective, which we can identify as the root of the 
contemporary problems that Kayser examines. Some account of Peirce's philosophy 
will clarify the implications of his contribution for knowledge representation (in the 
broadest sense of that term). On that basis, Kayser's concerns and recommendations 
for conceptual structures development can be specifically considered. 

Peirce conceived a pragmatic theory of inquiry and meaning as an alternative to 
the prevailing philosophical view that abstract terms and concepts must explain 
concrete experience, and Kant's conclusion that the most abstract (or metaphysical) 
concepts are ultimate and unanalyzable [see CP: 5.177, 207, 289, 294, 500ff]. Peirce 
insisted that philosophy, just as any science, 

deals with positive truth, indeed, yet contents itself with observations such as come 
within the range of every man's normal [universal] experience, and for the most 
part in every waking hour of his life. . . These observations escape the untrained 
eye precisely because they permeate our whole lives, just as a man who never 
takes off his blue spectacles soon ceases to see the blue tinge. Evidently, therefore, 
no microscope or sensitive film would be of the least use in this class. The 
observation is observation in a peculiar, yet perfectly legitimate, sense. . . . [and] 
every special science ought to take that little into account before it begins work 
with its microscope, or telescope, or whatever special means of ascertaining truth it 
may be provided with. [CP: 1.241, 1.246] 

The sciences often conveniently ignore puzzles and paradoxes, he says, "and logic 
here seems to touch metaphysics" [CP: 6.182]. 

All along, in his construction of philosophy as a science, Peirce maintains a self-
critical stance: "I am free to confess that objections to this way of thinking have 
forced themselves upon me and have been found more formidable the further my 
plummet has been dropped into the abyss of philosophy, and the closer my 
questioning at each new attempt to fathom its depths" [CP: 5.15]. I freely confess 
that the following account only attempts to interpret and reconstruct from the 
fragments that remain of his philosophical work (I limit references to Collected 
Papers, as much as possible, but even these can be misleadingly composed and 
incomplete). I barely skim the surface of both "the abyss" and his systematic effort to 
plummet and pursue a "scientific metaphysics." Please note: I avoid Peirce's
nineteenth-century terminology as much as possible (he says, himself: "the twentieth 
century would laugh at us if we were too squeamish about the word's legitimacy of 
birth" [CP: 2.7, ff pi]). And I do not presume to make more than a suggestion of 
what his philosophy offers. 



3 Peirce's Scientific Metaphysics 

Peirce followed Aristotle and Kant (whom he called "the two greatest of metaphysical 
systematizers" [CP: 2.36]) in struggling to make metaphysics (which he considered 
the philosophy of vagueness [1.204]) logically accessible for examination and use: "it 
is to be assumed that the universe has an explanation, the function of which, like that 
of every logical explanation, is to unify its observed diversity" [CP: 1.487]. He does 
not follow Aristotle and Kant in concluding that logic depends on metaphysics; 
instead, the reverse: "To me, it seems that a metaphysics not founded on the science 
of logic is of all branches of scientific inquiry the most shaky and insecure" [CP: 
1.36]. And contrary to Aristotle and Plato, he places mathematics above logic in 
terms of abstractness [CP: 1.240fn.], joining many of his contemporary British and 
Continental philosophers. Although he expressed concern for the state of 
metaphysics, "a subject much more curious than useful, the knowledge of which, like 
that of a sunken reef, serves chiefly to enable us to keep clear of it" [CP: 5.410], he
disagreed with the common philosophical opinion of his time: "that Metaphysics is 
backward because it is intrinsically beyond the reach of human cognition," and that 
"its objects are not open to observation." He blamed its neglect in modern philosophy 
on the fact that scientists were for the most part uncritical nominalists [see 8]. His 
philosophy contends that our everyday experience is so saturated with the data of 
metaphysics that we usually pay no attention to them [see CP: 6.2]. Peirce concurred 
with the opinion of several great thinkers that "the only successful mode [of 
metaphysical research] yet lighted upon is that of adopting logic as our metaphysics" 
[CP 7.580]. According to Esposito, Peirce's aim was "to discover a unified theory of 
logic, psychology, and metaphysics, and to present it in some sort of logical form" [6: 
147]. 

Peirce's scheme for classifying the sciences (first constructed in 1896) will serve as 
a sort of reference diagram (see Figure 1). He uses the word "science" in a broader 
sense than we currently do, in English, which is roughly equivalent to the German 
"Wissenschaft."

Science is to mean for us a mode of life whose single animating purpose is to find
out the real truth, which pursues this purpose by a well-considered method, 
founded on thorough acquaintance with such scientific results already ascertained 
by others as may be available, and which seeks cooperation in the hope that the 
truth may be found, if not by any of the actual inquirers, yet ultimately by those 
who come after them and who shall make use of their results. It makes no 
difference how imperfect a man's knowledge may be, how mixed with error and 
prejudice; from the moment that he engages in an inquiry in the spirit described, 
that which occupies him is science, as the word will here be used. [CP: 7.54]. 

If we view his Classification as a schematic guide to the relations he hypothesizes 
among scientific studies, the structure indicates that each science draws upon the 
principles of the studies above it on the list, mathematics being the simplest and most 
abstract science. Consequently, Peirce says, "We should expect to find metaphysics, 
judging from its position in the scheme of the sciences, to be somewhat more difficult 
than logic, but still on the whole one of the simplest of sciences, as it is one whose 
main principles must be settled before very much progress can be gained either in 
psychics or in physics" [CP: 6.4]. 



The Structure of Philosophy in a Classification of the Sciences of Discovery 

represents philosophy as a positive science, in the sense of discovering what really is 
true but limited to "so much of truth as can be inferred from common experience." 
The special sciences are principally occupied with the accumulation of new facts 
inferred from specific human and natural inquiries [see CP: 1.183-238]. 

I. Mathematics (the Conditional or Hypothetical Science, whose only aim is to 
discover not how things actually are, but how they might be supposed to be, if not in 
our universe, then in some other [CP: 5.40], studies what is and what is not logically
possible, without making itself responsible for its actual existence) 

II. Philosophy (in Three Branches: Phenomenology ascertains and studies the kinds 
of elements universally present in the phenomenon; meaning by "the phenomenon," 
whatever is present at any time to the mind in any way; Normative science 
distinguishes what ought to be from what ought not to be, and makes many other 
divisions and arrangements subservient to its primary dualistic distinction; 
Metaphysics seeks to give an account of the universe of mind and matter. Normative 
science rests largely on phenomenology and on mathematics; Metaphysics on 
phenomenology and on normative science [CP: 1.186]) 

A. Phenomenology (the science of experience [in terms of Category 
Theory: Firstness, Secondness, Thirdness]. Peirce says: "The most 
fundamental fact about the number three is its generative potency. This is a 
great philosophical truth having its origin and rationale in mathematics. . . . 
I analyze experience, which is the cognitive resultant of our past lives, and 
find in it three elements. I call them Categories. ... I will so far follow 
Hegel as to call this science Phenomenology although I will not restrict it 
to the observation and analysis of experience but extend it to describing all 
the features that are common to whatever is experienced or might
conceivably be experienced or become an object of study in any way direct
or indirect" [CP: 4.390, 2.84, 5.37]) 

B. The Normative Sciences ("the science of the laws of conformity of
things [as phenomena] to ends, that is, perhaps, to Truth, Right, and 
Beauty" [CP: 5.121]) 

1. Esthetics (the science of ideals; considers those things whose 
ends are to embody qualities of feeling) 

2. Ethics (the theory of self-controlled or deliberate conduct; 
considers those things whose ends lie in action) 

3. Logic (Formal Semiotic, or the science of self-controlled or 
deliberate thought; considers those things whose end is to represent 
something) 

a. Philosophical Grammar (Speculative Grammar, or the 
theory of meaning) 

b. Critical Logic (the theory of inference) 

i. Abduction (logic of hypothesis) 

ii. Induction (logic of sampling) 

iii. Deduction (formal logic) 

c. Philosophical Rhetoric (Speculative Rhetoric, i.e., theory 
of method) 

C. Metaphysics ("the science of Reality. Reality consists in regularity. Real
regularity is active law. Active law is efficient reasonableness, or in other 
words is truly reasonable reasonableness. Reasonable reasonableness is 
Thirdness as Thirdness" [CP: 5.121-129; see also V 6, B 1]) 

1. Ontology (general metaphysics)

2. Psychical metaphysics (religion) 

3. Physical Metaphysics or Cosmology (natural laws) 

III. The Special Sciences: these are now represented by the various disciplines in a 
college of arts and sciences, apart from philosophy and mathematics. Peirce divides 
these into what we now think of as the humanities and human studies (including 
psychology and the social sciences) and the natural sciences, which would correspond 
roughly to the distinction between the Geisteswissenschaften and the
Naturwissenschaften [see 7].



Figure 1. 

Peirce's Classification represents his hypothetical generative relations among the 
sciences. That is, from his logical perspective, the more concrete or special sciences 
would require the results of the more abstract or general sciences as fundamental 
principles. While mathematics and philosophy would make use of the results (or 
selected data) of the former only to support generalizations as needed, he insists that 
their abstractions must, in some manner, give account of themselves in terms of 
concrete experience in human affairs (in demonstrations, dialogue, and other conduct 
in which their indices are indispensable and function as relative pronouns that refer— 
accidentally or indirectly— to existing or possibly existing things [see CP: 2.611]). 
The deplorable state of metaphysics is evidence of insufficient philosophical 
understanding of what he says is the function of this "hypostatic abstraction" in 
normative science (which, for example, transforms "it is light" into the testable 
hypothetical, "there is light here" [see CP: 4.235]) [see CP: 6.2]. Metaphysics, just as 
any scientific inquiry, must insist that all abstractions be interpreted hypothetically, in 
terms of "what they would or might (not actually will) come to in the concrete" [CP: 
6.485]. A highly speculative and fallible science, metaphysics should be cautiously 
undertaken and its results treated as highly provisional [see 7]. 

Existence is the ultimate hypostatic abstraction, the ultimate hypothesized 
condition (that is, the ultimate concept of what experience refers to) that gives 
normative science the concept of truth, and the special sciences the concept of 
objective reality. The postulation of existence gives us the hope that, "if the truth of 
any question can ever be got at, we shall eventually get at it" [CSP-MS 831; 1900] 
and that "the real is that which sooner or later, information and reasoning would 
finally result in, and which is therefore independent of the vagaries of you and me" 
[CP: 5.311]. Pragmatism is the method of scientific investigation that reminds us to 
keep our concepts of what is real growing, guided by that hope. Peirce's 
Classification begins to clarify the nature of investigations needed to build a 
scientifically useful philosophy. Based on the phenomenological categories 
(Firstness, Secondness, and Thirdness), his philosophy constructs a normative theory 
of how our experience can continue to grow from instinctive (or metaphysical) 
experience to knowledge through inquiry, or in learning by self-governed conduct (in 
thought and action). 

I hold that we can directly observe [these categories] in the elements of whatever is 
at any time before the mind in any way. They are the being of the qualitative 
possibility, the being of the actual fact, and the being of law [or mediation, or 
regularity] that will govern facts in the future [CP: 1.23]. 

Peirce's metaphysics postulates that these modes of being (or existence, as we 
phenomenally experience it) are all real, in the experiential terms of sensory 
stimulation, intuition or imagination, judgments and beliefs; while the purely 
ontological question of what exists beyond experience remains forever hypothetical. 
This metaphysical theory asserts that, fundamentally, we know to the extent that our 
experience is constrained by what exists — and that, ultimately, we know to the extent 
of the growth in our experience through our capability to test those constraints, which 
depends on how effectively we can represent them. In the representation of 
constraints, they become cognitive possibilities. Our capability to represent the 
'external' constraints we experience of whatever exists has evolved to culminate in 
the "scientific method," which Peirce formulates as pragmatism. Without the 
philosophical (or metaphysical) acknowledgment of existential constraints (no matter 
how uncertainly or provisionally we must conceive them), knowledge becomes 
hopelessly relative to individual human tastes (nominalistic), which makes 
communication and the collaborative knowing we call "science" irrelevant and even 
impossible, in any sense that truly meets human needs. 

These logical implications convinced Peirce that metaphysics needed to be 
especially carefully reconstructed, by pragmatic inquiry. Otherwise, uncritically and 
dogmatically adopted beliefs would continue to influence the special sciences: "Find 
a scientific man who proposes to get along without any metaphysics . . . and you have 
found one whose doctrines are thoroughly vitiated by the crude and uncriticized 
metaphysics with which they are packed" [CP: 1.129]. The sciences would remain in 
a state of confusion as to what is real, unconsciously caught in what Hookway calls a 
"web of words" (that is, words referring to other words as definitions), which is just 
the condition in which we now find much of cognitive science and AI research [8: 259]. 

Peirce's scientific, metaphysical postulation of existence allows him to develop a 
theory of experience as inquiry that accommodates linguistic and conceptual 
relativity, but not as a fundamental principle to assume; instead "man's reason is allied 
to the originating principle of the universe" [CP: 2.24]. The history of science, 
among other evidence, shows that "man's guesses at the course of nature are more 
often correct than could be otherwise accounted for, while the same facts equally 
prove that this [reasoning instinct] is extremely uncertain and deceptive, and 
consequently unfit to strengthen the principles of logic in any sensible degree" [CP: 
2.25]. Our ultimate concern, then, is not merely to establish consensus that would 
simply resolve diverse opinions, but to reach consensus about interpretations (as 
hypotheses) that could then continue to be tested and modified in further, concerted 
investigation of the (always hypothetical) existential conditions. "There would not be 
any such thing as truth unless there were something which is as it is independently of 
how we may think it to be" [CP: 7.659]. "Truth being, then, the agreement of a 
representation with its object" [CP: 8.3], which we know only approximately, as 
tending to be true. 

In an unpublished manuscript, Peirce clarifies the subtle but crucial distinction he 
draws between reality and existence, in terms of the necessary metaphysical 
relationship we must postulate between representations and their objects, or between 
the model and the modeled. 

By real I always mean that which is such as it is whatever you or I or any 
generation of men may opine or otherwise think that it is. There must not be any 
confusion between reality and existential,— that is real which is as it is no matter 
what one may think about it; the existential is that which is as it is whatever one 
may think about anything. No doubt there are grades of reality, meaning that 
objects of signs may yield with more or less resistance to opinion or 
representation. According to the definition absolute resistance is essential to 
reality. But an approach to reality, something that is not in the slightest of the 
nature of a pretense is found wherever an object of thought is sufficiently obstinate 
to enable us to say, it has not those characters but it does have these, there is 
already a lesson in logic. Namely, that one may lay down the very best of 
definitions, going to the very heart of things; and yet there will be, as it were, a 
little living mouse of a quasi exception which will find or make a hole to get in 
when all seemed hermetically closed. This mouse will not be a mere pest to be got 
rid of and forgotten. It will be a fellow being to be remembered and to be 
appraised. [MS 498: 32-33] 

In the current terms of AI research, we might consider reality as an evolving 
ontological model of existence, for which pragmatic methods in normative science 
would define the significant contextual relations (or concepts) of particular 
application domains as they inevitably evolve in operation. "The effect of 
pragmatism [as a scientific method of conduct] is simply to open our minds to 
receiving any evidence" [CP: 8.259]; so that, "as we remain disposed to self-criticism 
and to further inquiry, we have in this [pragmatic] disposition an assurance that if the 
truth of any question can ever be got at, we shall eventually get at it" [MS 831: 4-5]. 
In the terms most pertinent to AI, the pragmatic disposition would give assurance that 
if the definition of any concept can ever be got at, we shall eventually get at it. I will 
hope to clarify these remarks in what follows (see also [9]). 



4 Pragmatic Ontology and Knowledge Representation 

Peirce's scientific philosophy does not rely on a deductive or axiom-theorem model 
[see 6], nor on the inductive or empiricist-positivist model that, in the twentieth 
century, we commonly considered to be scientific. Instead, he introduces pragmatism 
as the method to mediate between such opposing approaches in constructing the 
normative model, or what I call a "generative model." If we follow his work from 
the late 1890s into the early twentieth century, we find that he refers to pragmatism 
variously as "the logic of abduction" (or of "retroduction," properly translated) [CP: 
5.195-205], "the method of methods" [CP: 7.59], "critical common sensism" [CP: 
5.494], and "the master key to all the secrets of metaphysics" [CP: 5.17]. 

Peirce tells us that the "authors of pragmatism" include the great philosophers of 
Western tradition, and that he only served as the "civil engineer" who subjected its 
elements to tests and constructed it to serve his scientific purpose: 

It is expected to bring to an end those prolonged disputes of philosophers which no 
observations of facts [alone] could settle, and yet in which each side claims to 
prove that the other side is in the wrong. Pragmatism maintains that in those cases 
the disputants must be at cross-purposes. They either attach different meanings to 
words, or else one side or the other (or both) uses a word without any definite 
meaning. What is wanted, therefore, is a method for ascertaining the real meaning 
of any concept, doctrine, proposition, word, or other sign. The object of a sign is 
one thing; its meaning is another. Its object is the thing or occasion, however 
indefinite, to which it is to be applied. Its meaning is the idea which it attaches to 
that object, whether by way of mere supposition, or as a command, or as an 
assertion. [CP: 5.5-7] 

Ontological metaphysics, far from being irrelevant in Peirce's philosophy, 
concerns the most general questions of all science: "What is reality? Are necessity 
and contingency real modes of being? Are the laws of nature real? Can they be 
assumed to be immutable or are they presumably results of evolution? Is there any 
real chance, or departure from real law?" [CP: 5.496]. These are matters that 
underpin our everyday, instinctive or intuitive behavior and unexamined feelings, 
reactions, and beliefs. They are assumptions upon which all our judgments and 
rational conduct depend: "our logically controlled thoughts compose a small part of 
the mind, the mere blossom of a vast complexus, which we may call the instinctive 
mind" [CP: 5.212]. His pragmatic theory of inquiry (or learning by experience) 
begins with the assumption that "All human knowledge, up to the highest flight of 
science, is but the development of our inborn animal instincts" [CP: 2.754, 6.604]. 
And yet, no creature can have instincts for every possible circumstance, and "when 
one's purpose lies in the line of novelty, invention, generalization, theory— in a word, 
improvement of the situation . . . instinct and the rule of thumb manifestly cease to be 
applicable" [CP: 2.178]. 

Human reason exercises instinct and intuition as imagination: "[We] can stare 
stupidly at phenomena; but in the absence of imagination they will not connect 
themselves together in any rational way" [CP: 1.46]. What I call Peirce's pragmatic 
ontology (or generative study of our experienced relations to existence) conceives the 
evolution of meaning as a "bootstrapping operation" that is not aimless, due to the 
hope which the postulation of existence provides: in scientific reasoning, we begin 
with what we can imagine the truth must be [see CP: 1.46]. Peirce's normative logic 
investigates the intelligent urge and ability to clarify our beliefs and opinions (as 
hypotheses) to make them more effectively testable and comparable with the opinions 
of others, in terms of three modes of scientific reasoning, again, related in generative 
form. 

Abduction is the process of forming an explanatory hypothesis. It is the only 
logical operation which introduces any new idea; . . . Abduction merely suggests 
that something may be. Its only justification is that from its suggestion deduction 
can draw a prediction which can be tested by induction, and that, if we are ever to 
learn anything or to understand phenomena at all, it must be by abduction that this 
is to be brought about. [CP: 5.171] 

The three normative sciences (esthetics, ethics, and logic) correspond in their order 
to Peirce's phenomenal categories (Firstness, Secondness, and Thirdness), and 
pragmatism is the method by which normative science can investigate the data of 
metaphysics. These data "appear in their psychological aspect as Feeling, Reaction, 
and Thought" [CP: 8.256], elements of experience that are too obvious for us to 
notice without the pragmatic method of self-critical investigation. Significantly, he 
contends: "It is not to philosophy only that pragmatism is applicable. I have found it 
of signal service in every branch of science that I have studied. My want of skill in 
practical affairs does not prevent me from perceiving the advantage of being well 
imbued with pragmatism in the conduct of life" [CP: 5.14]. 

In everyday life each of us must assume or believe something (make judgments), 
in order to direct our conduct with respect to whatever exists — to make our actions 
more than simple physical reactions (to mediate our feelings and actions by means of 
inferences about what appears to be true). The urge to reach conclusions, or to "take 
our maps to be the territory" (or take our beliefs to be the truth), is a necessary part of 
effective behavior; but intelligence gives us the capability to believe provisionally 
and, by observation and imagination, to examine the actual and possible outcomes of 
our behavior, self-critically, by as many means as we can create to do so (special 
skills of observation, multiple powers of expression and comparison of these 
observations, and elaborate technological augmentations of these skills and powers 
through media). In science (or any human enterprise conducted as science) we must 
suppose that such "pragmatic mediation" will continue indefinitely, and that our 
concepts, definitions, and ontologies (or representations of any sort) will never be 
complete and precise, but only indicate what is possible evidence to test in further 
experience. 

What I call "pragmatic ontology" would be the hypostatic abstraction of indefinite 
inquiry (that is, what hypothetically represents the real model of existence as the 
hypothetical goal of inquiry; I will hope to clarify this confusing philosophical 
expression in terms of a model methodology!). Accordingly, the method by which 
we pursue inquiry must represent and support the evolution of knowledge in terms of 
regularity, or structural tendencies in the growth of meaning. Peirce's concern about 
nominalism can be crudely expressed as the individual's (possibly instinctive) urge to 
believe that some particular "map of reality," or ontological assumption, might as 
well be considered to be the correct conceptual construct of "the territory," or what 
exists (since no one can know for sure otherwise), thereby rendering the idea or hope 
of genuine inquiry pointless. In response to this hazardous instinctive conclusion, his 
generative philosophy requires the pragmatic investigation of our reliance on 
representations, by which we continue to increase our knowledge of the regularity of 
any existential conditions. His new question for a scientific philosophy is "How does 
meaning grow?" 

We "block the road of inquiry" (which is the motto Peirce said should be above 
any door of philosophy) to the extent we believe that any abstract structure can finally 
contain meaning. The human predicament is that we must believe we can contain or 
record meaning (in our expressions and thoughts) to some extent, or else we would 
never be able to reason at all — much less conduct ourselves on the basis of our 
thoughts, or express them to conduct ourselves jointly. Pragmatism invites us to 
accept this as an ontological constraint or normative condition of inquiry, but to 
employ it consciously by provisionally using concepts, categories, and defined terms 
to construct as many ontological models as possible, while continuing to examine 
these representations critically, and to test them experientially. Considered as the 
logic of abduction, Peirce says pragmatism should perform two functions: "Namely, it 
ought, in the first place, to give us expeditious riddance of all ideas essentially 
unclear. In the second place, it ought to lend support, and help to render distinct, 
ideas essentially clear, but more or less difficult of apprehension; and in particular, it 
ought to take a satisfactory attitude toward the element of Thirdness" [CP: 5.206]. 

Peirce reminds us: "The nominalists' difficulty ... is their habit of reducing the 
possible to the actual and of not distinguishing the actual [or our immediate sense of 
the existential] from the real [or our mediated experience of the actual as possible]" 
[CP: 1.422]. The nominalist assumption that the identity of the real is merely a 
matter of representation, which is ultimately undefinable, inverts his phenomenal 
Categories (Thirdness, Secondness, Firstness) to claim that abstractions explain 
concrete experience. What I interpret as "pragmatic ontology" contends that 
representations must continue to give an account of themselves, in tests of concrete 
experience, if we want reliable knowledge (not merely what purports to be logically 
valid knowledge). Only simple qualities of sense or feeling, or blind reactions 
between these, can be indefinable; concepts are relational and definable, but never 
finally and absolutely [see CP: 8.305]. He contends that Reality is a conception of 
representation as representation (that is, representation as the mediation between 
actuality and possibility in experience). "To be a nominalist consists in the 
undeveloped state in one's mind of the apprehension of Thirdness as Thirdness [or 
representation as representation]. The remedy for it consists in allowing ideas of 
human life to play a greater part in one's philosophy" [CP: 5.121-129]. In response to 
what Kayser identifies as AI's "ontology problem," this remedy calls on AI research 
to question its assumptions about concepts, relations, and other "entities of concern" 
in building efficient ontological models. 

5 Pragmatism and Ontological Problems in Conceptual Structuring 

Peirce's philosophy implies the need for pragmatically evolving models of ontology 
in scientific investigation of metaphysical assumptions, to rescue AI research from 
the nominalist's problem of trying to ignore metaphysical issues. Peirce even 
encourages: "If the idealist school will add to their superior earnestness the diligence 
of the mathematician about details, one will be glad to hope that it may be they who 
shall make metaphysics one of the true sciences" [CP: 8.118]. Kayser invites such 
investigation in the last line of his paper: "This idea [of conceptual adaptation] 
provides by itself no methodology; at least, endorsing it means to stop looking for 
oversimplistic solutions where they obviously do not work" [1: 47]. We can now 
interpret Kayser's main thesis as a metaphysical problem made puzzling by AI's 
unconscious adoption of an idealistic metaphysics, for which Kayser himself suggests 
the pragmatic solution: 

[T]he inadequacy of current AI ontologies— as soon as they concern a domain 
dealing with concepts which are not perfectly defined logically— originates in an 
idealistic view which does not meet AI needs. This view should be replaced by a 
much more flexible one, taking into account variability, i.e., the fact that 
ontological uniqueness is a wrong requirement for AI, and borrowing from the 
linguistic processes by which the meaning of a word gets adapted to its context of 
use [1: 36-37]. 

More specifically, Kayser says, "AI has completely neglected that what we choose 
concepts to refer to must be revisible," and he suggests the idea of "variable 
ontology" to respond to the "real world" circumstance of our changing conceptual 
identifications. He tentatively concludes: "The only consequence that we draw for 
now is that the solution of an 'ontological problem' should not be a structure such that 
the concepts of the application rigidly map to a unique entity of the [conceptual] 
structure. The degrees of freedom that are needed should answer the problems of 
vagueness or fuzziness as well as the problem of the identity of objects" [1: 39]. 

Kayser gives examples of how difficult it is to use language in structures that are 
'ontologically correct,' even for specific application contexts. If we arbitrarily assign 
a "basic" concept to an object by means of a term, he explains, then "nearly every 
occurrence of the term must be treated as a metonymy, that our knowledge of the 
usual situations [as the experience of interactions with the object] allows us to decode 
more or less accurately, in order to get an expression using the 'basic' meaning" [1: 43]. 
He reviews the difficulties to be acknowledged: 

- a choice of the "basis" [or basic meaning], for which no solid ground seems 
available, 

- a tedious, sometimes reckless, translation of the occurrences of the term, 
resulting in 

- an overall impression of artificiality and inadequacy of the conceptual 
representation. [1: 43] 

These specific difficulties can be construed as the same problems for which Peirce 
urged the application of pragmatism to build a scientific philosophy. Rather than 
looking for 'basic meaning,' AI needs to be able to identify what meaning is unclear in 
any particular application context, and either eliminate the term and concept that 
represent it, or render the meaning distinct by considering (imagining what would be) 
the consequences of accepting specific definitions, in the conduct of specific human 
affairs. Meaning should never be assumed to reside in the terms, definitions, 
relations, or conceptual representations of any sort; they are only the medium [or 
Thirdness] by which meaning continues to grow. Real meaning is ascertained by 
experimentation in real contexts of actual (or existential) operation. Pragmatism is 
"no metaphysical doctrine, no attempt to determine any truth of things," says Peirce; 
this "method of ascertaining the meanings of words and concepts is no other than that 
experimental method by which all the successful sciences . . . have reached the 
degrees of certainty that are severally proper to them today; this experimental method 
being itself nothing but a particular application of an older logical rule, 'By their fruits 
ye shall know them'" [CP: 5.464, 465]. The method prescribed "is to trace out in the 
imagination the conceivable practical consequences, that is, the consequences for 
deliberate, self-controlled conduct, of the affirmation or denial of the concept; and 
. . . herein lies the whole of the purport of the word, the entire concept" [CP: 8.191]. 

Although Kayser agrees that the question of how KR researchers choose their 
entities of concern is an ontological question, he concludes that traditional philosophy 
offers no ontological help to KR researchers in choosing entities of concern to build 
representational structures [1: 35]. "AI wants to find the basis for efficient models; 
philosophy wants to discuss the problem of real existence; their goals are not 
compatible and their methods are completely contradictory" [1: 35-36]. 

Peirce conceived pragmatism to reconcile such different goals (understood as 
nominalist and realist goals). The efficiency of intelligence in establishing reliable habits 
(models, beliefs, judgments, definitions, conduct, and so forth) depends on our 
capability to conceive past results and postulate future outcomes (possible causes and 
effects), through communication (with oneself and in cooperation with others) in 
inquiry. Genuine knowledge progresses to the extent that we are capable of 
representing our observations effectively (as a basis for further observation), and can 
continue to develop this capability by means of tools to serve that purpose. Pragmatic 
methodology is conceived to maintain more valid (internally consistent) and reliable 
(externally responsible) evolution of representation. Efficiency entails real 
effectiveness; validity entails existential reliability. Efficient models are models of 
reality, in Peirce's terms. 

Peirce's ontology claims that the human mind must have been attuned to the truth 
of things in order for science to discover what it has discovered. Instinct is the very 
bedrock of logical truth [CP: 6.476], on which all reasoning (as regularity in thought) 
must be built [CP: 6.500]. However, "an absolutely determinate term cannot be 
realized, because, not being given by sense, such a concept would have to be formed 
by synthesis, and there would be no end to the synthesis because there is no limit to 
the number of possible predicates. A logical atom, then, like a point in space, would 
involve for its precise determination an endless process. We can only say, in a general 
way, that a term, however determinate, may be made more determinate still, but not 
that it can be made absolutely determinate" [CP: 3.93]. 

Kayser pinpoints the problem of object identity (the "extensional" view of what 
constitutes a concept) and the problem of the inevitable vagueness of concept 
definition (the "intensional" view of what constitutes a concept). "What we consider in 
the real world to be an object tends to be much more problematic than most of us are 
ready to admit [and] language adds a lot of difficulties that we cannot ignore" [1: 38]. 
He considers the difficulties exhibited in the Generative Lexicon a hint as to the 
pointlessness of keeping concepts pure from any linguistic contamination [see 1: 43-44]. 

Pragmatic ontology would analyze these metaphysical difficulties in 
phenomenological terms, as Peirce suggests in the following excerpt, and normative 
science would investigate the implications for any special scientific purpose, such as 
AI's. 

The mind, by its instinctive adaptation to the Outer World, represents things as 
being in space, which is its intuitive representation of the clustering of reactions. 
What we call a Thing is a cluster or habit of reactions, or, to use a more familiar 
phrase, is a centre of forces. In consequence of this double mode of association of 
ideas, when man comes to form a language, he makes words of two classes, words 
which denominate things, which things he identifies by the clustering of their 
reactions, and such words are proper names, and words which signify, or mean, 
qualities, which are composite photographs of ideas of feelings, and such words 
are verbs or portions of verbs, such as are adjectives, common nouns, etc. [CP: 
4.157]. 

Finally, Kayser suggests the possible solutions of semantic adaptation and 
conceptual adaptation, based on awareness that a word can take virtually an unlimited 
set of meanings in various contexts, and that dictionaries keep only the most typical 
semantic values (a seldom acknowledged limitation to text understanding). He 
speculates that concept adaptation may be as necessary in conceptual structuring as 
semantic adaptation is inescapable for text understanding, implying the need to 
employ multiple conceptual structures, without being misled by identity of concept 
names. He suggests the idea of introducing levels, with stable concepts being 
structured for each domain of interest at different levels (levels could be discarded 
when modification makes no sense or regular patterns give meaningful results). He 
hypothesizes that conceptual adaptation would be needed to cope with ontological 
multiplicity, if we want tools truly adapted to real life domains; then he deduces: 

More precisely, I argue that any formula (F) representing some piece of knowledge 
(or some questions to be answered, e.g., if the formula has free variables) in a 
conceptual language should not be matched with (for consistency checking) or fed 
into (for inference) a single conceptual structure, identifying the concept names 
appearing in the formula with those occurring in the pre-existing structure. Instead 
of that, the formula should rather be confronted with several such structures, 
without being misled by identity of concept names. In each case, a process of 
conceptual adaptation should be invoked, in order to discard the levels where (F) 
makes no sense, and to modify (F), following regular patterns, in order to get from 
it meaningful results. [1: 45-46] 
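
The procedure Kayser describes can be illustrated with a minimal sketch. It is not Kayser's implementation, and every name in it (ConceptualStructure, adapt, confront, the "book" example) is a hypothetical introduced here for illustration. It assumes, very crudely, that a formula is a tuple of concept names, that each structure lists the concepts meaningful at its level, and that the "regular patterns" are simple rewrites of one concept name into its local reading.

```python
from dataclasses import dataclass, field


@dataclass
class ConceptualStructure:
    """A hypothetical conceptual structure for one level or domain of interest."""
    name: str
    concepts: set                                 # concept names meaningful at this level
    patterns: dict = field(default_factory=dict)  # regular rewrites: name -> local reading


def adapt(formula, structure):
    """Adapt a formula (here, a tuple of concept names) to one structure.

    Returns the adapted formula, or None when the formula makes no sense
    at this level, in which case the level is discarded.
    """
    adapted = []
    for concept in formula:
        # Follow a regular pattern if one exists; otherwise keep the name as given.
        local = structure.patterns.get(concept, concept)
        if local not in structure.concepts:
            return None
        adapted.append(local)
    return tuple(adapted)


def confront(formula, structures):
    """Confront the formula with several structures; keep only meaningful readings."""
    readings = {}
    for structure in structures:
        adapted = adapt(formula, structure)
        if adapted is not None:
            readings[structure.name] = adapted
    return readings


# Two assumed levels at which the same name "book" is read differently.
physical = ConceptualStructure("physical", {"physical_object", "weigh"},
                               {"book": "physical_object"})
textual = ConceptualStructure("textual", {"text", "read"}, {"book": "text"})

print(confront(("book", "read"), [physical, textual]))
# prints {'textual': ('text', 'read')}
```

The point of the sketch is only that the concept name "book" is never identified with a single entry in a single structure: the physical level is discarded for this formula, and the formula is rewritten before it is used at the textual level.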



6 Conclusions 

I argue that Peirce's philosophy would supply the theoretical and methodological 
guidance needed to carry out Kayser's suggestions (or abductions, expressed above), 
pragmatically: 
A hypothesis must explain the phenomena in question. An [abductive] analysis of 
its logical purport, of its would-bes, allows an inquirer to determine this. 
Deduction then develops the implications of the would-bes, and induction tests for 
the reality of the generality that is the hypothesis or, more accurately, the object of 
the hypothesis, and thus "gives us the only approach to certainty concerning the 
real that we can have" [CP: 8.209]. 
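
As a rough illustration of the generative ordering of the three modes of reasoning, the following sketch runs one pass of abduction, deduction, and induction over a toy problem. It is an assumption-laden simplification and not a rendering of Peirce's logic: hypotheses are taken to be plain functions, "explaining" an observation is reduced to reproducing it, and all names (abduce, deduce, induce, inquiry_cycle, world) are hypothetical.

```python
def abduce(observations, candidates):
    """Abduction: keep the candidate hypotheses that would explain every observation."""
    return {name: rule for name, rule in candidates.items()
            if all(rule(x) == y for x, y in observations)}


def deduce(rule, new_cases):
    """Deduction: draw out what the hypothesis predicts for cases not yet observed."""
    return [(x, rule(x)) for x in new_cases]


def induce(predictions, experiment):
    """Induction: test the predictions and report the fraction confirmed."""
    confirmed = sum(1 for x, predicted in predictions if experiment(x) == predicted)
    return confirmed / len(predictions)


def inquiry_cycle(observations, candidates, new_cases, experiment):
    """Run one abduction-deduction-induction pass; the results remain provisional."""
    support = {}
    for name, rule in abduce(observations, candidates).items():
        support[name] = induce(deduce(rule, new_cases), experiment)
    return support


candidates = {
    "double": lambda x: 2 * x,
    "square": lambda x: x * x,
    "add_one": lambda x: x + 1,
}
world = lambda x: 2 * x  # assumed stand-in for what experience would return

print(inquiry_cycle([(2, 4)], candidates, [3, 5], world))
# prints {'double': 1.0, 'square': 0.0}
```

Both "double" and "square" survive abduction, since each accounts for the single observation; only the inductive test of their deduced predictions separates them, and further cases could still revise the surviving hypothesis.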

Elsewhere [10], I have suggested what is called the "collaboratory testbed model" 
(currently being advanced in scientific research) as the context in which KR research 
and development could be conducted pragmatically. In that context, technological 
innovation evolves by means of "partnerships" involving those who need technology 
to augment their collaborative scientific inquiry and those who build augmentation 
technology to support that work. The evolution of such systems could be conceived 
in terms of pragmatic ontology. I have suggested that the communications 
infrastructure for the testbed methodology could be advanced by integrating KR tools 
for that purpose. Conceptual Graphs and Formal Concept Analysis employed in 
various modes of communication (both normative and algorithmic) could create an 
effective medium for the conceptual structuring that collaboration requires. Many 
Conceptual Structures tools already exhibit promise of that potential and only need a 
collaboratory research program, in which they could be considered tests of Peirce's 
scientific philosophy— whether AI researchers care about the metaphysical status of 
what they are studying or not! The Sisyphus testbeds [10] for conceptual structures 
research indicate the rationale and benefits of a pragmatic research methodology; and 
I have suggested that the PORT (Peirce Online Resource Testbeds [11]) project 
would be a significant context for developing, evaluating, and continuing to improve 
KR tools by means of, and for, further collaboratory infrastructure advancements. In 
that research and development context, application systems would be treated as 
theoretical hypotheses to be tested and modified for continuing improvement. Tools 
would be embodied hypotheses of what is required for specific purposes in specific 
contexts of operation [see 12]. Pragmatic ontology would need an evolving 
communication medium, in which truly challenging puzzles could be investigated. 
Here are two (which Peirce says would require his existential graphs): 

They are the puzzle of the relation of signs to minds, and of their communication 
from one mind to another, and the puzzle of the composition of concepts and the 
nature of the judgment, or, as we of the antipsychological school say, of the 
proposition [MS 498: 29; an unpublished MS, entitled by Peirce, “On Existential 
Graphs as an Instrument of Logical Research”] 

In our development of media for representation and communication, particularly 
technology for the progressively more efficient production of written language, we 
have been led to believe that we can contain meaning, or record it in permanent, 
identically reproducible representational (and idealized) structures, such as the 
"correct ontologies" that Kayser says are too simplistic. But we now have an 
electronic medium of representation with which we may hope instead to represent the 
evolution of meaning, and to realize that the structures we create are not ends 
(ontologies) but means (media) for continuing the growth of experience and 
knowledge. We no longer need to identify stability of meaning with the ideal of 
establishing absolute definitions that we imagine to be consistently and perfectly 
interpreted. Metaphysically observed, what we think of as "the stability of 
phenomena" is regularity, or tendency in growth of meaning. Crucial for any 
Conceptual Structures project is to build an operation that will support continuing 
conceptual growth. Peirce's pragmatic philosophy does not simply encourage us to 
develop yet another conventional mode of analyzing media and expressions, such as 
formal logic serves for linguistic instances, as though they were existential 
conditions — timeless "facts," or the ontological foundations of traditional philosophy. 
Instead, he developed his philosophy to investigate and discover how concepts (along 
with the media and modes we use to express them) evolve: an evolution that will 
continue, whether theoretical logic and science help us comprehend it or not [see 13]. 

AI ontologies are experimental representations of reality, based on tentatively 
defined relations as evidence of what might possibly exist. We cannot expect that 
medium of representation to give us a reasonable chance of knowing and controlling 
actual conditions, validly and reliably, without metaphysical postulation and 
normative testing in their evolution. 

Memory supplies us a knowledge of the past by a sort of brute force, a quite 
binary action, without any reasoning. But all our knowledge of the future is 
obtained through the medium of something else. To say that the future does not 
influence the present is untenable doctrine. It is as much as to say that there are no 
final causes, or ends. The organic world is full of refutations of that position. Such 
action [by final causation] constitutes evolution. But it is true that the future does 
not influence the present in the direct, dualistic, way in which the past influences 
the present. A machinery, a medium, is required. Yet what kind of machinery can 
it be? Can the future affect the past by any machinery which does not again itself 
involve some action of the future on the past? All our knowledge of the laws of 
nature is analogous to knowledge of the future, inasmuch as there is no direct way 
in which the laws can become known to us. We here proceed by experimentation. 
That is to say, we guess out the laws bit by bit. We ask, What if we were to vary 
our procedure a little? Would the result be the same? We try it. If we are on the 
wrong track, an emphatic negative soon gets put upon the guess, and so our 
conceptions gradually get nearer and nearer right. The improvements of our 
inventions are made in the same manner. The theory of natural selection is that 
nature proceeds by similar experimentation to adapt a stock of animals or plants 
precisely to its environment, and to keep it in adaptation to the slowly changing 
environment. But every such procedure, whether it be that of the human mind or 
that of the organic species, supposes that effects will follow causes on a principle 
to which the guesses shall have some degree of analogy, and a principle not 
changing too rapidly. [CP: 2.86]. 

By investigating fragmentary manuscript evidence from the past, we might hope to 
make Peirce's existential abduction Pragmatically Ours, for the future of knowledge 
representation. "Indeed, it is the reality of some possibilities that pragmaticism is 
most concerned to insist upon" [CP: 5.453]. 
Abstract. As ontologies become common in more applications and as 
those applications become larger and longer-lived, it is becoming in- 
creasingly common for ontologies to be developed in distributed envi- 
ronments by authors with disparate backgrounds. Ontologies that are 
expected to be collaboratively created and maintained over time by 
authors in many locations present special challenges to the problem of 
conceptual modeling. In this paper, we will discuss conceptual model- 
ing issues and focus on those topics with elevated importance in dis- 
tributed environments. We will draw on our experience creating and 
maintaining ontologies in differing knowledge representation and rea- 
soning environments over the last decade. Many of our recent obser- 
vations are drawn from our experiences in the DARPA High Perform- 
ance Knowledge Base Program. This program generated dozens of 
knowledge bases authored by people of varying expertise in both 
knowledge representation and reasoning as well as domain experience. 
Our efforts in merging the ontologies, loading them for coordinated 
use, and modifying them to meet evolving needs shape much of the 
material in this paper. Additional sources of observations are from de- 
signing and building a number of e-commerce ontologies (with content 
merged from multiple sources) and also from a few families of description 
logic applications including the PROSE/QUESTAR family of configurators 
and the FindUR knowledge-enhanced search applications. 



1 Introduction 

In recent years ontologies have become the subject of interest in communities beyond 
just those of knowledge representation and library science. They moved out of the 
research labs and were included in most expert system applications in the 80s and 90s, 
and many of them supported sophisticated inference as well as retrieval. Often 
these ontologies were built by people highly trained in knowledge representation and 
reasoning. Often the author(s) also became highly literate in the domain as 
well. Although challenging, conceptual modeling for these kinds of situations has 
been studied and literature exists on proposed strategies for conceptual modeling for 
projects in which the knowledge acquisition is a fairly centralized process. 

Taxonomies, sometimes faceted, have also been the domain of library science for a 
number of years. They have been successfully developed and maintained by trained 
staff. One of the more famous examples is the Dewey Decimal System, which is still 
in use today over a century after it was conceived, and has an active staff associated 
with it [Vizine-Goetz and Mitchell, 1996]. We will not focus in this area where 
trained library scientists do their classification work but note that just as the knowl- 
edge representation community is now reaching out to the non-trained public, the 
library science community is also reaching out to “non-catalogers” (such as the work 
by the Dublin Core). Arguably, much of this is driven by the ubiquity of the web 
and efforts such as the resource description framework (RDF) [Brickley and Guha, 
2000]. 

Recently large ontologies have become more common in broad consumer applica- 
tions ranging from search (such as Yahoo, Lycos, etc.), to e-commerce and auctions 
(such as Amazon, eBay, etc.), to configuration (such as Dell, PC-Order, etc.), to more 
general information sites (such as cnet.com). Sometimes the larger ontologies are so 
broad that, almost by their nature, they are best designed and maintained in a distrib- 
uted manner by multiple experts. Sometimes the vocabularies are just simple tax- 
onomies, but many recent background knowledge organizations present structure in 
the form of properties (such as wine properties in Virtual Vineyards or electrical 
component properties in specification search on necx.com). It is this class of 
ontologies that we consider in this paper: those that one expects to be created and 
maintained by a staff whose work may not be highly coordinated as the ontology 
evolves. We include both simple concept taxonomies and ontologies with structure 
and many inter-relationships. 

One motivational model is the Open Directory project by DMOZ where the goal is 
to build an “Internet brain” by leveraging the expertise of many experts. (At publi- 
cation time, there were over 25,000 registered experts, which is up 25% in the last 
four months, over 250,000 categories, and over 1.75 million sites.) This may be at 
the extreme end of the spectrum since the ontology is fairly simplistic in nature and 
enormous in scope, but there are a number of e-commerce and other ontology efforts 
that have “cybrarian or ontologist” staffs in the dozens and content areas that are very 
broad. It is our speculation from empirical observation that ontology staffs will con- 
tinue to grow, the training of entry-level staff will broaden, and ontologies will con- 
tinue to become more ubiquitous. Given these speculations, the role for ontology 
environments in distributed settings is becoming more critical. 



1.1 Three Motivational Application Areas 

Since we believe that expected and actual ontology usage heavily impacts resulting 
ontology design, we feel it is instructive to provide a context from which to judge our 
observations and conclusions. This paper draws largely on three somewhat distinct 
ontology efforts: a decade of description logic applications (and description 
logic system and environment design and development), co-directing the Stanford 
University Knowledge Systems Laboratory’s High Performance Knowledge Base program 
(and its frame system environment design and development), and commercial 
consulting in ontologies. We begin, of course, with rather extensive training in 
knowledge representation and reasoning. 

The description logic applications fall broadly into a family of configurators (used 
by AT&T and Lucent to configure transmission equipment [Wright, et al., 1993, 
McGuinness and Wright, 1998a, 1998b, McGuinness, et al., 1995]), data mining 
applications [Selfridge, et al., 1992] (a successor application was marketed by 
NCR), and ontology-enhanced search applications and environments (used in online 
calendars [Fuoss, et al.], electronic yellow-pages [McGuinness, 1998], AT&T Worldnet, 
and medical applications [McGuinness, 1999a]). Additionally, of course, there 
was active development on the representation and reasoning system [Borgida, et al., 
1989, Brachman, et al., 1990] and the supporting environment for use, which led to 
work on pruning languages [McGuinness, 1996, Baader, et al., 1999, Borgida and 
McGuinness, 1996], explaining reasoning systems [McGuinness, 1996, McGuinness 
and Borgida, 1995, Borgida, et al., 1998], and usability and environment work 
[McGuinness and Patel-Schneider, 1999, Brachman, et al., 1999]. 

These applications typically had fairly intricate structure of the underlying repre- 
sentation, and at least in the configurators, had extensive use of the reasoner. Their 
representation in a description logic provided the opportunity for knowledge engi- 
neers to input term definitions with precise semantics. Most of the knowledge engi- 
neering was done in a structured and “project-managed” fashion so that it was possi- 
ble to train people about how to build knowledge bases and how to extend them. 
Environments were built that supported knowledge engineers in building their own 
new applications so that knowledge representation experts were not required to do 
much work when new configurators were added (nor when new content areas were 
added within the framework of the knowledge-enhanced search applications). The 
data mining applications also had fairly intricate and inter-related object structure. 
The object structure was used to connect (and essentially to join) a number of legacy 
database systems. The inference level was not as extensive as that of the configurator 
applications. The main goal was to provide data analysts with natural and modular 
access to the data to support their quest for finding patterns in the data. The knowl- 
edge-enhanced search applications had the simplest modeling and reasoning structure. 
They were initially motivated by AT&T’s Personal Online Services’ browsing and 
search needs. They were expected to be used by a very broad user base. The appli- 
cation scope expanded to include more targeted user populations in narrower domains 
as well as broad user communities in wide domains. These applications hold the 
greatest similarities to the ontologies we observed in recent commercial ontology 
applications. 

The work on the High Performance Knowledge Base program [Cohen, et al., 1998, 
MacGregor and McGuinness, 1999] used the Stanford ontology environment including 
Ontolingua [Farquhar, et al., 1996], OKBC [Chaudhri, et al., 1998], KIF [Genesereth 
and Fikes, 1992], and Chimaera [McGuinness, et al., 2000a, 2000b] to build 
and maintain many large knowledge bases for intelligence analyst usage. It had a 
much more distributed nature than the work previously mentioned. The source 
documents appeared to be generated by multiple experts (usually not trained in 
knowledge representation). The knowledge engineers were distributed throughout 
the country and did not always have a lot of interactions with each other or with the 
original authors of the source documents. The knowledge representation experts 
were also distributed and sometimes time pressure did not allow them to coordinate 
their conceptual models as much as they would like. Additionally, ontologies were 
many times written by a single author or group for one purpose and then picked up by 
other groups for other purposes in the same content area. 

The ontologies typically had fairly intricate structure both in terms of including 
objects with dozens of properties and many inter-relationships and also in terms of 
having fairly deep hierarchies. The controlled vocabularies of the upper level ontolo- 
gies had significant size as well. Inference was sometimes complicated and also 
many times required common sense reasoning in broad areas (such as geographic 
reasoning over much of the world). Thus, these ontologies required breadth and 
depth in most dimensions that one would usually consider when measuring ontology 
complexity. 

The work consulting on commercial ontologies [McGuinness, 1999b] is less aca- 
demic but has been quickly deployed and thus has more recent empirical basis. The 
time frames common in startup applications allow (and force) quick deployment and 
evaluation of ideas. The ontologies were sometimes designed on paper and developed 
in house using internal (sometimes proprietary) tools, sometimes designed and 
implemented using a combination of Stanford tools and CLASSIC, sometimes 
handed over to be maintained internally, and sometimes maintained through 
consulting. The ontologies for consulting purposes tended to contain less intricate structure 
and less nesting but sometimes had significant depth, breadth, and total size. In a 
number of commercial ontologies, it was important to include existing “standard” 
published vocabularies as a portion of the final ontology. 



2 Distributed Ontology Desiderata 

We will now present some general guidelines abstracted from the previously men- 
tioned ontology experiences. For every principle, we will attempt to motivate its 
need and benefits. We will also include a discussion of some of the related chal- 
lenges and include a list of operational guidelines. 



2.1 Incorporate Standard Concept Vocabularies (and Capture Their Semantics) 

A goal is to include industry standard vocabularies that are familiar to the major 
classes of users of the system. Assuming the ontology will evolve, those classes of 
users include both the knowledge engineers as well as the users of the application. 
While it is important in centralized ontologies to use standard terms, it becomes more 
critical as ontologies become larger, more difficult to browse, more difficult to re- 
member, and more complicated to extend and maintain. 

Essentially every current generation set of principles for building an ontology in- 
cludes articulating the expected uses of the ontology. We would add to this, the no- 
tion of articulating the expected user profiles (including both knowledge engineers 
assuming the ontology will evolve and also the application users). In the user articu- 
lation, there should be some characterization of what (if any) controlled vocabularies 
are expected to be familiar. Also, information about the expected ambiguity of an 
ontology should be included. For example, do the vocabularies provide necessary 
and sufficient conditions for membership in a class? Do they provide examples of 
class membership (and possibly examples of non-membership)? Are the ontologies 
just hierarchies of potentially ambiguous terms? Additionally, it should be noted if 
incoming data is likely to be consistent (or tagged) with a particular vocabulary. For 
example, if one is beginning a new e-commerce business-to-business application, one 
should consider how emerging standards like the UN/SPSC might interact with the 
resulting ontology. If, for example, any major sources of input data are expected to 
use a controlled vocabulary, then knowledge engineers will have an easier job if the 
application ontology is either compatible with or already includes the controlled vo- 
cabulary. Similarly, if end users are expected to be in environments where they are 
using terms from a controlled vocabulary, then they will have more intuitive search 
and browsing experiences with the application if those terms are recognized by the 
new application ontology. 

If the application ontology contains all of the notions included in a user’s vocabu- 
lary yet it uses different term names, then it will slow the user down and sometimes 
will make the user ineffective. For example, if the user expects to see “car” in the 
ontology and thus does a search for the string “car”, he or she will fail to find cars in 
an ontology that uses only the terms “vehicle” and “auto” (if a standard syntactic 
retrieval is performed). There are a number of coping strategies for multiple names 
for the same notion such as thesaurus incorporation, query expansion, browsing sup- 
port, translators, etc., but before they all end up as requirements to the system design, 
it is worth determining if a different base ontology would help solve a number of 
problems. In a growing number of areas, there appear to be some well-designed 
candidate ontologies that are becoming better supported and accepted. Examples 
include SNOMED and UMLS in medicine, UN/SPSC in e-commerce, etc. 
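
The “car” example can be made concrete with a small sketch of query expansion over a thesaurus. Everything here is an illustrative assumption rather than part of any system described in this paper: the synonym sets, the term list, and the function names are hypothetical, and a real thesaurus would be far larger. 

```python
# Hypothetical thesaurus: each set groups names treated as the same notion.
SYNONYM_SETS = [
    {"car", "auto", "automobile"},
]

# Hypothetical ontology term names (the ontology never uses the string "car").
ONTOLOGY_TERMS = ["vehicle", "auto", "truck"]


def syntactic_search(query, terms):
    """Standard syntactic retrieval: return terms whose name equals the query."""
    return [t for t in terms if t == query]


def expand_query(query, synonym_sets):
    """Return the query together with every synonym recorded for it."""
    expanded = {query}
    for group in synonym_sets:
        if query in group:
            expanded |= group
    return expanded


def expanded_search(query, terms, synonym_sets):
    """Search for the query or any of its synonyms, keeping ontology order."""
    wanted = expand_query(query, synonym_sets)
    return [t for t in terms if t in wanted]


if __name__ == "__main__":
    print(syntactic_search("car", ONTOLOGY_TERMS))
    print(expanded_search("car", ONTOLOGY_TERMS, SYNONYM_SETS))
```

In this sketch the plain search for “car” finds nothing, while the expanded search reaches “auto”. It still does not reach “vehicle”, which is a more general term rather than a synonym; finding it would require the sub-class information discussed later in this section. 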

Another issue related to standard vocabularies is that not only are they emerging, 
many different vocabularies are emerging as viable options. As the field of common 
ontologies sorts itself out, one can expect to feel compelling needs to support 
compatibility with multiple controlled vocabularies. This leads to another principle, 
discussed below, concerning systematic environment and tool support for interacting 
with and merging multiple ontologies. 

As ontologies become larger, browsing becomes more difficult. For example, in 
the high performance knowledge base program upper level ontology, there exist thou- 
sands of terms and it is not always easy to find the term for which one is looking. 
Even after some “rationalization”, undoubtedly, there are remaining issues in terms of 
alignment that should be handled. In our merging effort (required to load many 
independently developed ontologies), we discovered a number of missing sub-class links. 
For example, we found that “strait” was defined to be a “body-of-water” and that 
“narrow-body-of-water” was also defined to be a sub-class of “body-of-water”, yet 
there was no sub-class relationship between “strait” and “narrow-body-of-water”. 
We hypothesized, and later confirmed, that what was apparently a missed sub-class 
relationship was due to the fact that the author of the “strait” term was unaware of the 
previously defined term “narrow-body-of-water”. 

In order to solve this problem, one cannot do simple synonym matching. Here an 
explicit representation of the semantics of the terms would be useful. If one had a 
precise definition, say that narrow bodies of water have a width of less than two miles 
and straits have widths of less than a mile, then in a classification system such as a 
description logic, a reasoner could automatically recognize that one is a sub-class of 
another. Still, even without precise definitions and classifiers, additional support can 
be provided. For example, in the initial (less specific) encoding of strait as a “body- 
of-water” (but not a “narrow-body-of-water”), a sibling analysis would note that 
“strait” is a sibling to “narrow-body-of-water” since they are both direct sub-classes 
of “body-of-water”. Sibling analysis would then ask if the new term, “strait”, can be 
made a sub-class (or super-class) of its current siblings and also if it is of the same 
level of specificity as its siblings. 
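
Both forms of support can be sketched in a few lines. The sketch below is a deliberately simplified stand-in for a description logic classifier, not the behavior of any particular system: a class is assumed to be defined only by its parent and an optional upper bound on width, and the names ClassDef, subsumes, and sibling_questions are hypothetical. 

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ClassDef:
    """A toy class definition: a parent plus an optional width restriction."""
    name: str
    parent: str
    max_width_miles: Optional[float] = None  # None means no definition given


def subsumes(general, specific):
    """True if the definitions imply that `specific` is a sub-class of `general`."""
    if general.parent != specific.parent:
        return False
    if general.max_width_miles is None or specific.max_width_miles is None:
        return False  # without definitions nothing can be inferred
    # "width less than 1 mile" implies "width less than 2 miles"
    return specific.max_width_miles <= general.max_width_miles


def siblings(new_class, ontology):
    """Other direct sub-classes of the new class's parent."""
    return [c for c in ontology
            if c.parent == new_class.parent and c.name != new_class.name]


def sibling_questions(new_class, ontology):
    """Questions a sibling analysis would raise when `new_class` is added."""
    return [
        f"Should '{new_class.name}' be a sub-class or super-class of '{s.name}'?"
        for s in siblings(new_class, ontology)
    ]


if __name__ == "__main__":
    narrow = ClassDef("narrow-body-of-water", "body-of-water", 2.0)
    strait = ClassDef("strait", "body-of-water", 1.0)
    print(subsumes(narrow, strait))
    # Less specific encoding: no width given, so fall back to sibling analysis.
    vague_strait = ClassDef("strait", "body-of-water")
    for question in sibling_questions(vague_strait, [narrow, vague_strait]):
        print(question)
```

With the width definitions present, the missing sub-class link is found automatically; without them, the sibling analysis only raises the question and leaves the decision to the knowledge engineer. 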

Additionally, some representation languages allow knowledge engineers to repre- 
sent classes that are disjoint from each other. Such classes may be considered mem- 
bers of a disjoint partition. In ontologies that contain partition or covering informa- 
tion, questions should be asked as to whether the new class should be added to any 
partitions under its parent class. For example, consider an ontology that contains a 
partition under the class “weapon” that distinguishes between the disjoint classes 
“biological-weapon” and “chemical-weapon”. An addition of a new sub-class of 
weapon, say “nuclear-weapon” should trigger questions such as 

Should “nuclear-weapon” be either a super-class or sub-class of 
“chemical-weapon” or “biological-weapon”? 

If “nuclear-weapon” is accurately a direct sub-class of weapon, then is it 
disjoint from the other sub-classes of weapon, and should it be added to the 
weapon class partition? 
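
A tool could generate such questions mechanically whenever a class is added under a parent that has partitions. The following is a minimal sketch under assumed data structures: partitions are stored as a dictionary from a parent class name to a list of sets of disjoint sub-class names, and the function name partition_questions is hypothetical. 

```python
def partition_questions(new_class, parent, partitions):
    """Return the questions to ask when `new_class` is added under `parent`.

    `partitions` maps a parent class name to a list of partitions, each a
    set of sub-class names declared to be mutually disjoint.
    """
    questions = []
    for members in partitions.get(parent, []):
        for member in sorted(members):
            questions.append(
                f"Should '{new_class}' be a super-class or sub-class of '{member}'?"
            )
        questions.append(
            f"Is '{new_class}' disjoint from {sorted(members)}, and should it "
            f"be added to this partition of '{parent}'?"
        )
    return questions


if __name__ == "__main__":
    partitions = {"weapon": [{"biological-weapon", "chemical-weapon"}]}
    for q in partition_questions("nuclear-weapon", "weapon", partitions):
        print(q)
```

The sketch only asks; it does not change the partition. Whether to extend it remains a modeling decision for the knowledge engineer. 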

Some operational guidelines include: 

Articulate anticipated ontology usage as well as expected user profiles. 

Use a controlled vocabulary that is familiar to users if one exists. 

Specify mappings between multiple standard controlled vocabularies if multi- 
ple vocabularies are standard in the domain(s) of interest. 

Allow for user extensibility of the mappings (thus, support users in adding 
new synonyms into a thesaurus). 

Allow for controlled vocabulary extensions. 

Provide supporting mechanisms such as query expansion in search to help fa- 
cilitate sub-class matches. 

Specify semantics of terms. 
Provide semantic retrieval instead of just syntactic retrieval. 

Provide additional structural retrieval methods such as sibling retrieval and 
analysis. 

Provide partition extensibility. 



2.2 Incorporate Standard Property Vocabularies 
(and Capture Their Semantics) 

As we mentioned in the introduction, it is becoming increasingly common for web 
ontologies to include inter-relationships. There has been more effort dedicated to 
developing standard hierarchies of classes (or noun phrases) than there has been on 
generating standard groupings (or hierarchies) of properties (or relationships between 
objects). Still, there are a number of organizations such as NIST, UN (in the 
UN/SPSC effort), NLM, etc., which are attempting to generate standard vocabularies 
of property names. Thus, we observe that even today a number of standard role vo- 
cabularies exist and we speculate that their growth will accelerate. 

We also note that just as there are issues with retrieving classes from large and un- 
familiar ontologies, there are also (at least as many) issues with finding role names in 
the same ontologies. All of the same problems exist and there may be less synonym 
support for simple synonym matching or query expansion. We have observed many 
times that role proliferation can quickly become a problem in large knowledge bases 
and we speculate that one major reason is because knowledge engineers are not find- 
ing existing useful roles. For example, a knowledge engineer who does not know that 
shipping-weight is already in the ontology may add a new role called weight. If the 
knowledge engineer did not know that shipping-weight existed, then there would be 
no connection between weight and shipping-weight. Now both terms will be in the 
ontology and with time, one could expect that some knowledge engineers find weight 
and thus fill in that role (without filling in shipping-weight) and other knowledge 
engineers will find shipping-weight and fill in that role (without filling in weight) and 
some may manage to fill both in (possibly inconsistently). Thus, some objects will 
have weights but not shipping weights and vice-versa. Possibly, even more problem- 
atically, there will be no enforceable constraints between the two roles. (Using com- 
monsense, a shipping-weight would be expected to be equal to or slightly higher than 
the actual product weight. So if an object had an actual weight of 3 pounds and a 
shipping weight of one pound, then there would seem to be parts missing and thus an 
inconsistency.) If there is no relationship between the roles, this inconsistency could 
not be detected and we would also have no way of letting the system do some work of 
deducing fillers or at least lower bounds on fillers for object shipping-weights (or 
possibly estimates of actual-weights) for users. 
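
Once the relationship between the two roles is stated, both the consistency check and the derived lower bound are simple to compute. The sketch below assumes, purely for illustration, that objects are dictionaries of role fillers, that both weights use the same unit, and that the only constraint is that shipping-weight is at least the weight; the function names are hypothetical. 

```python
def check_weights(obj):
    """Return a list of inconsistencies between the two weight roles."""
    actual = obj.get("weight")
    shipping = obj.get("shipping-weight")
    if actual is not None and shipping is not None and shipping < actual:
        return [
            f"shipping-weight ({shipping}) is less than weight ({actual})"
        ]
    return []


def shipping_weight_lower_bound(obj):
    """Deduce a lower bound on shipping-weight when only weight is filled."""
    return obj.get("weight")


if __name__ == "__main__":
    suspicious = {"weight": 3, "shipping-weight": 1}
    print(check_weights(suspicious))
    print(shipping_weight_lower_bound({"weight": 3}))
```

Without a declared connection between the roles, neither function could be written generically; the constraint has to live in the ontology rather than in each application. 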

When we reviewed a number of knowledge bases that contained large numbers of 
concepts and roles, we found this role proliferation and lack of connection between 
related roles to be extremely common. Another connection between roles that was 
missed routinely was an inverse relationship. For example, one source of data may 
include people, countries, and information about which people are leaders of 
countries. Thus, an ontology might include a property named “leader-of-country” and in 
the case of the individual “Clinton”, “leader-of-country” would be filled with 
“United-States”. Similarly another data source may include the property “has-leader” 
that has a domain including countries or possibly even more specifically a property 
with the singular domain of countries that might be called “country-has-leader”. If 
there were no inverse relationship stated between “leader-of-country” and 
“country-has-leader” then it would appear that an application did not know who the 
filler of “country-has-leader” is on the United States even though it knows that 
United States fills the role “leader-of-country” on Clinton. In our analysis of a 
number of knowledge bases designed for simpler purposes and later extended for 
broader usage, we found this kind of role inconsistency and lack of inverse 
relationships to be quite common. 
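
The effect of declaring an inverse can be illustrated with a small sketch. It assumes facts are stored as (subject, role, filler) triples and that inverses are declared in a simple table; the representation and the function names are hypothetical and are not taken from any of the systems discussed here. 

```python
# Hypothetical declaration: each key is the inverse of its value.
INVERSES = {"leader-of-country": "country-has-leader"}


def with_inverses(facts, inverses):
    """Return the facts plus every fact implied by a declared inverse role."""
    both_ways = dict(inverses)
    both_ways.update({v: k for k, v in inverses.items()})
    closed = set(facts)
    for subject, role, filler in facts:
        if role in both_ways:
            closed.add((filler, both_ways[role], subject))
    return closed


def fillers(facts, subject, role):
    """Return the known fillers of `role` on `subject`, sorted for stability."""
    return sorted(f for s, r, f in facts if s == subject and r == role)


if __name__ == "__main__":
    facts = {("Clinton", "leader-of-country", "United-States")}
    print(fillers(facts, "United-States", "country-has-leader"))
    closed = with_inverses(facts, INVERSES)
    print(fillers(closed, "United-States", "country-has-leader"))
```

Before the inverse is applied, the question of who leads the United States has no answer; afterwards the filler is derived from the fact stated on Clinton. 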

We also found that objects can quickly gather many properties in ontologies but 
many times only a few of the properties are useful for a particular purpose. It is not 
uncommon in our configuration examples and also in our data mining examples, for 
objects to have hundreds of properties. It is also not uncommon for the user to be 
only interested in a few property values. Large and complicated objects typically 
require some sort of pruning mechanism to help users focus their attention on the 
aspects of the object presentation that are relevant to them. Some knowledge 
representation systems like CLASSIC [Brachman, et al., 1991] include a compositional 
language extension that allows knowledge engineers and users to specify 
(context-sensitive) matching patterns to be used when displaying or explaining objects. 
Simpler approaches include simple markup languages or just flags that tell the system 
if it should display portions of objects. 

Some operational guidelines include: 

Use a controlled vocabulary that is familiar to users if one exists. 

Specify domains and ranges of roles (for example, the domain of “leader-of- 
country” is “person” and the range is “country”). 

Specify inverse relationships between roles (for example, “leader-of-country” 
is the inverse of “country-has-leader”). 

Specify active inferences to infer constraints between role values (for example, 
“shipping-weight” is greater than or equal to the “actual-weight”). 

Specify conversion rules for presenting different views of fillers for properties 
(for example, provide a rule that can calculate a filler for the price role in 
German Marks if a price filler is known in United States Dollars and a multi- 
plier is available). 

Provide some sort of markup language to allow knowledge engineers and us- 
ers to prune out roles for certain presentation (and explanation) views. 



2.3 Utilize (or Develop) Environments to Support Ontology Evolution 

As ontologies become larger and more structure is put in place, it becomes increas- 
ingly important to use tool environments to enforce consistency and also to aid users 
in focusing their attention in the areas where they are needed to resolve problems. It 
seems to be a given that most large ontology applications will end up needing to 
merge multiple ontologies together given that most specifications will require sup- 
porting more than one standard vocabulary. Merging small ontologies may not be 
difficult to do manually, but once ontologies become large, it becomes more critical 
to provide systematic tool support. Merging tools should identify terms that clearly 
should be merged (i.e., objects with the same names with the same internal structure 
and semantics), terms that may be merged (i.e., terms with names that are known to 
be synonyms for each other with internal structure that is not incompatible), and terms 
that must not be merged (i.e., terms that are known by their definition to be disjoint). 

Merging tools also can be used to focus the attention of the user into areas that are 
likely to require human re-work. For example, if there is a term that appears as a 
suffix of a number of other terms, it may be the case that sub-class relationships 
should exist. For example, if one ontology includes the term “weapon” and another 
ontology includes the terms “biological-weapon” and “chemical-weapon”, it is a 
likely guess that “weapon” should be a super-class of both “biological-weapon” and 
“chemical-weapon”. Merging tools may also include the sibling and partition analy- 
sis that we mentioned in Section 2.1. 
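
The suffix heuristic is easy to sketch. The version below is only an illustration of the idea, not how any particular merging tool implements it: it assumes hyphen-delimited term names, and the function name is hypothetical. Its output is a list of candidate sub-class links for a person to review, since a shared suffix is a hint and not a proof. 

```python
def suffix_subclass_candidates(terms_a, terms_b):
    """Return (sub, super) pairs where `super` is a hyphen-delimited suffix of `sub`.

    Terms from both ontologies are pooled, so candidates may cross ontologies.
    """
    all_terms = set(terms_a) | set(terms_b)
    candidates = []
    for sup in sorted(all_terms):
        for sub in sorted(all_terms):
            if sub != sup and sub.endswith("-" + sup):
                candidates.append((sub, sup))
    return candidates


if __name__ == "__main__":
    first = ["weapon"]
    second = ["biological-weapon", "chemical-weapon"]
    for sub, sup in suffix_subclass_candidates(first, second):
        print(f"Is '{sub}' a sub-class of '{sup}'?")
```

Requiring the hyphen before the suffix avoids pairing unrelated names that merely end in the same letters. 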

Verification and validation of ontologies is also an increasingly important task, 
particularly when ontologies are generated as the result of merging two or more 
sources. When ontologies become too large for experts to scan, they need support in 
identifying problem areas. There are a number of things that can be done automati- 
cally. For example, authoritative sources, if they exist, may be used to check data 
when it is input from multiple sources. General deductions may be enforced such as 
“no element may be an instance of more than one class in a disjoint partition”. 
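A check of the quoted disjoint-partition deduction might look like the following sketch. The data layout (a dictionary of named partitions and a dictionary of asserted classes per instance) and all names are hypothetical, chosen only for illustration.

```python
def partition_violations(partitions, instance_classes):
    """Return (instance, partition name, clashing classes) triples.

    An instance violates a disjoint partition when it is asserted to belong
    to more than one class of that partition.
    """
    violations = []
    for name, classes in partitions.items():
        for instance, asserted in instance_classes.items():
            clash = sorted(set(asserted) & set(classes))
            if len(clash) > 1:
                violations.append((instance, name, clash))
    return violations


# Hypothetical merged ontology: one disjoint partition and two instances.
partitions = {"vehicle-medium": {"ground-vehicle", "air-vehicle"}}
instance_classes = {
    "bus-42": {"ground-vehicle", "public-transport"},
    "glider-7": {"ground-vehicle", "air-vehicle"},
}
violations = partition_violations(partitions, instance_classes)
print(violations)
```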

Another portion of a diagnostic tool is a more subjective analysis. For example, it
is rarely the case that cycles should exist in class graphs: it is unlikely
that we would want to say that a cat is a mammal, a tiger is a cat, and a mammal is a
tiger. (As an empirical observation, cycles do show up in a number of ontologies
currently deployed on web sites, as we discovered from crawling a number of
sites. Our analysis is that most, if not all, of the cycles we found can and should
be broken.) Cycles can be identified and brought to the attention of the user. Similarly,
there are many “rules of thumb” that can be used to “critique” an ontology. For
example, typically one finds multiple sub-classes of a super-class. If a class has one
and only one sub-class, it is usually an indication that that portion of the hierarchy has
not been completed. A log of all single-child classes can be useful for
knowledge engineers in focusing their attention on the areas of the taxonomy that are
likely to be incomplete.
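Both critiques, cycle detection and the single-child log, can be sketched over a simple sub-class map. The representation (a dictionary from each class to its direct super-classes) and the function names are assumptions made for this illustration.

```python
def find_cycle(subclass_of):
    """Return one cycle in the sub-class graph as a list of classes, or None."""
    visiting, done = set(), set()

    def visit(node, path):
        if node in done:
            return None
        if node in visiting:
            # node is already on the current path: report the loop.
            return path[path.index(node):] + [node]
        visiting.add(node)
        for parent in subclass_of.get(node, ()):
            cycle = visit(parent, path + [node])
            if cycle:
                return cycle
        visiting.discard(node)
        done.add(node)
        return None

    for start in subclass_of:
        cycle = visit(start, [])
        if cycle:
            return cycle
    return None


def single_child_classes(subclass_of):
    """Return the classes that have exactly one direct sub-class."""
    children = {}
    for child, parents in subclass_of.items():
        for parent in parents:
            children.setdefault(parent, set()).add(child)
    return sorted(p for p, kids in children.items() if len(kids) == 1)


# The cat/tiger/mammal cycle from the text.
cyclic = {"tiger": ["cat"], "cat": ["mammal"], "mammal": ["tiger"]}
cycle = find_cycle(cyclic)

# An acyclic taxonomy in which "cat" has only one sub-class.
taxonomy = {"tiger": ["cat"], "cat": ["mammal"], "dog": ["mammal"]}
lonely = single_child_classes(taxonomy)
print(cycle, lonely)
```

Neither result is an error by itself; both are items for a knowledge engineer to review.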

An environment should provide support for multiple versions. In the simplest 
case, applications will have the ontology that is currently deployed and the one that is 
under construction. More typically, in a distributed environment, there may be multi- 
ple sub-portions of the development ontology undergoing simultaneous development. 

An environment should include support for extending both high level and lower 
level ontologies. That may be best done by a combination of social and program- 
matic support. As ontologies become larger, systematic and consistent naming and 
organizational rules become more important. Simple naming rules can help quite a
bit. In many of our larger applications we used consistent naming conventions such
as prefixing all roles with “has-” so that a quick glance would determine if “mother”
is the class of female persons who have at least one child or if “mother” is the binary
relation between children and their mothers. If “mother” is the class and “has-mother”
is the binary relation, then there is no ambiguity in the name. Environments
that are used to generate ontologies can easily support and enforce naming conven- 
tions. 

Many conventions may be useful and/or required. For example, if one is conducting
business in a state that taxes luxury and non-luxury items differently, then
it is critical for all products for sale to have a way of figuring out if they are luxury
items or not. Otherwise, pricing cannot be done on orders. Required properties can
be clearly stated in a published ontology construction methodology and also may be 
enforced by ontology editing and analysis environments. We are exploring options of 
supporting required naming conventions in commercial ontology environments. 

Some operational guidelines include: 

- Incorporate an ontology merging tool into the environment.
- Employ focus of attention techniques to assist humans in their analysis tasks.
- Provide a diagnostic tool set that analyzes provably incorrect information.
- Provide a diagnostic tool that suggests possible problems.
- Provide a critique analysis that suggests potentially better representation style.
- Provide a version control mechanism.
- Provide support for extending upper and lower level ontologies.
- Make conventions explicit on how the ontology is constructed and how one
  should make extensions, including naming conventions, organizational principles,
  what new levels mean (and when they should be added or deleted),
  when partitions are used (and how they are added/modified), etc.



3 Discussion 

Others have preceded us in the areas of conceptual modeling. We embrace most of 
the guiding principles laid down in the fields of conceptual graphs, description logics, 
general frame systems, object-oriented modeling, knowledge acquisition, etc. for
building consistent and principled ontologies. We are also not the first to focus on
the areas of conceptual modeling for large or distributed environments (consider the
work done at Cycorp or the work of [Swartout et al., 1996, Erikson et al., 1999,
Fridman et al., 1999, Oliver, 1999], etc.). We may have had a wider range of experiences
in our ontology environments, including multiple description logic environments
along with frame systems and theorem provers, and also may have had broader
ranging needs across the application families. The work has been driven by large 
corporations, such as AT&T, Lucent, NCR, and recently Cisco, Internet startups, and 
government-funded academic research. Our recent work on merging and ontology
diagnostics attempts to synthesize the learnings across all of the domains and may
provide a unique combination of facilities aimed at today’s world of extensible, dis- 
tributed ontologies with emphasis on diagnostics and ontology critiquing. Our goal 
with this paper is to provide a focus for the evolving field of distributed ontology 
management. Another goal is to attempt to encourage the publication of suggested 
protocols for distributed ontology generation and maintenance. 



4 Summary 

In this paper, we have focused on issues that have greater importance when one is 
managing ontologies in a distributed manner. We have attempted to synthesize our 
guidelines from a wide range of applications developed in varying knowledge repre- 
sentation and reasoning environments with a breadth of ultimate application goals. 
We presented some structure in guidelines for conceptual modeling in distributed 
environments and welcome evolutionary suggestions to help refine the desiderata. 

Abstract. Knowledge-based systems (KBS) are not necessarily based on
well-defined ontologies. In particular it is possible to build very successful
KBS for classification problems, but where the classes or conclusions are 
entered by experts as free-text sentences with little constraint on textual 
consistency and little systematic organisation of the conclusions. This paper 
investigates how relations between such ‘classes’ may be discovered from 
existing knowledge bases. We have based our approach on KBS built with 
Ripple Down Rules (RDR). RDR is a knowledge acquisition and knowledge 
maintenance method which allows KBS to be built very rapidly and simply by 
correcting errors, but does not require a strong ontology. Our experimental 
results are based on a large real-world medical RDR KBS. The motivation for 
our work is to allow an ontology in a KBS to ‘emerge’ during development, 
rather than requiring the ontology to be established prior to the development of 
the KBS. It follows earlier work on using Formal Concept Analysis (FCA) to 
discover ontologies in RDR KBS. 



1 Introduction 

Most knowledge acquisition methodologies first build a model of domain knowledge 
before applying this to building a particular problem solver, e.g. KADS and
CommonKADS [20], Protege2000 [24]. Although this approach facilitates re-use it 
does not overcome the knowledge-acquisition and maintenance bottleneck and these 
problems are present both in the development of the ontology and consequent 
problem solver. 

The RDR approach starts knowledge acquisition (KA) to build the problem solver 
immediately without any modelling apart from a simple attribute-value data 
representation [17]. Even the attribute-value representation can be developed while 
KA is in progress. The focus of the approach is to make the addition of each 
incrementally added piece of knowledge as simple and as reliable as possible. 
Although this approach facilitates KA and maintenance, it does not facilitate re-use 
because of the lack of an ontology. 

Richards and Compton [18] have previously applied formal concept analysis 
(FCA) to develop conceptual models from RDR systems and for example have 
discovered interesting models from a 60 rule blood-gas KB which was itself part of a 
larger RDR KB. The RDR approach, along with other KBS approaches, allows a 
class to be covered by a number of disjoint rules. The approach further encourages
this in that RDR does not allow modification of existing rules and all rules have to be 
added as new disjuncts. As will be discussed below a new rule is added by the expert 
selecting conditions to cover a misclassified case but exclude other cases previously 
correctly classified [9], [10]. 

For example, one of the classes in a real-world RDR KBS consists of a disjunction 
of 74 rules with each rule a conjunction of conditions. If this is displayed using an 
FCA tool, the class is scattered across many different nodes and it is difficult to get an 
overall impression [18]. In a simple example in figure 2, the class BUS is in two
nodes. The real-world KB we are using has 3710 rules and 101 classes. If we try to
display such a knowledge base using FCA tools the problem is exacerbated. Mineau
et al. argue that in the FCA framework “the choice of conceptual scales depends on
the purpose of the analysis. For instance, other attributes may be suitable for market
analysis than for decision support” [15]. This may well be appropriate in exploring
data. However, we assume that since we are dealing with rules rather than raw cases, 
the relevant attributes have already been identified and extracted. Our aim is to 
discover the appropriate ontology given that the relevant attributes (and values) are 
already well-identified. 

A second aspect of the problem is that in a real-world system, attributes are multi- 
valued rather than boolean. Rule conditions can subsume each other, be disjoint etc. 
For example age > 10 subsumes age > 50, whereas age > 40 and age < 10 are disjoint.
Our method needs to not only combine information about classes from across the 
knowledge base, but to address the way in which conditions based on multi-valued 
attributes interrelate (see Fig. 1). 
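The two relations between conditions mentioned above can be illustrated by treating each numeric condition as an open interval. This is a sketch under our own assumptions: conditions are written as `(operator, value)` pairs and only `>` and `<` are handled.

```python
import math


def interval(op, value):
    """Turn a condition such as ('>', 10) into an open interval (low, high)."""
    if op == ">":
        return (value, math.inf)
    if op == "<":
        return (-math.inf, value)
    raise ValueError(f"unsupported operator: {op}")


def subsumes(general, specific):
    """True if every value satisfying `specific` also satisfies `general`."""
    g_low, g_high = interval(*general)
    s_low, s_high = interval(*specific)
    return g_low <= s_low and s_high <= g_high


def disjoint(a, b):
    """True if no value can satisfy both conditions (open intervals)."""
    a_low, a_high = interval(*a)
    b_low, b_high = interval(*b)
    return a_high <= b_low or b_high <= a_low


# The age examples from the text.
print(subsumes((">", 10), (">", 50)))   # age > 10 subsumes age > 50
print(disjoint((">", 40), ("<", 10)))   # age > 40 and age < 10 are disjoint
```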



Class: Satisfactory lipid profile previous raised LDL noted 

((LDL <= 3.4) AND (Triglyceride is NORMAL) AND (Max(LDL) > 3.4)) OR
((LDL is NORMAL) AND (Triglyceride is NORMAL) AND (Max(LDL) is HIGH)
AND (Net_Change(LDL) < 0))

Fig. 1. An example of multi-valued attributes in RDR. 

Finally, the problem is compounded by the way in which the expert adds
conclusions. When an RDR KB makes an error, the expert identifies the attributes 
and values that justify a different conclusion. In adding the conclusion, the expert can 
select from a list of pre-existing conclusions organised into broad categories, but is 
also free to simply type in a new conclusion - and often does this. In medical 
pathology result interpretation, the evaluation domain here, the conclusions added by 
the pathologist may provide advice to the referring clinician on patient diagnosis, 
management, how treatment is progressing, whether the tests ordered were 
appropriate, what tests might still be necessary or any combination of the above. It is 
quite clear to both the expert and the receiver of the advice what information is being 
provided in the free text interpretation, but these interpretations are a long way from 
the well-defined classes of a formal ontology. A task analysis would assess this 
domain as a classification problem, but this does not imply well-defined classes. 
Hence, the problem is not only that disjuncts for a class may be scattered across the 
KB, but that the same class may be represented by different text strings or a text string 
may cover a combination of different classes. Some examples are given in Table 6. 

The question perhaps arises of whether it would be more appropriate to start with a
well-developed ontology, as suggested by most KA researchers. There have been a 
number of RDR papers over the years presenting data on the practical advantages of 
the incremental KA approach provided by RDR. Commercial RDR systems are now 
available from Pacific Knowledge Systems Pty. Ltd. In evaluation studies of this 
software a large Australian pathology laboratory has developed about 8 RDR KBs 
ranging from a few hundred to 7000 rules. These are all in routine use and to date 
have processed over 500,000 laboratory reports, handling much of the laboratory’s 
report output. They were all built by pathologists or laboratory scientists after about 2 
hours training. The laboratory has documented that KA time was about one minute 
per rule and appropriateness of the interpretations is >98%. (It is a trivial matter to 
improve this further if really needed.) Since this development has been incremental it 
has had minimal impact on the laboratory’s normal workflow and has been a trivial
addition to the normal duties of staff. These results have not yet been formally 
published, but they strongly confirm the previous RDR research results. Rather than 
leading to the conclusion that it would be better to start with a well-developed 
ontology, these results raise the question of whether in practice it may be simpler to 
re-develop rather than re-use. However, if we can discover the ontologies implicit in 
these incrementally developed systems, we may be able to have the best of both 
worlds. The 3710 rule KB we are using here is an early version of the 7000 rule PKS 
KB. 

In proposing techniques to discover relations between rule conclusions (i.e. the 
class relation model (CRM)) we consider three basic relations: subsumption, mutual- 
exclusivity and similarity. The application of these is shown in the following simple 
RDR example. RDR will be explained in more detail later. For the moment it is 
sufficient to note that a class in RDR is the set of disjoint rule paths each giving the 
same conclusion and that a rule path consists of all the conditions from a rule’s 
predecessor rules plus the conditions of the rule itself. The rule paths (disjuncts) for 
this example are derived from the RDR KBS in figure 2. They are: 

- rule path 2 : class VEHICLE <- hasEngine=YES
- rule path 3 : class VAN <- hasEngine=YES, passenger>5
- rule path 4 : class BUS <- hasEngine=YES, passenger>10
- rule path 5 : class SPORT_CAR <- hasEngine=YES, passenger=2
- rule path 6 : class TRUCK <- weight>2000kg, speed>30km/h
- rule path 7 : class BUS <- weight>2000kg, speed>30km/h, publicTransport=YES
- rule path 8 : class GLIDER <- moveOn=AIR
- rule path 9 : class GROUND_VEHICLE <- moveOn=GROUND
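The way a rule path collects the conditions of a rule and of its predecessors can be sketched as follows. The parent links are an assumption: figure 2 is not reproduced here, so they are inferred from the rule paths listed above, with rule 1 taken to be an empty root.

```python
# rule id -> (parent id, conclusion, the rule's own conditions).
# Parent links are inferred from the listed rule paths (assumption).
RULES = {
    2: (1, "VEHICLE", ["hasEngine=YES"]),
    3: (2, "VAN", ["passenger>5"]),
    4: (2, "BUS", ["passenger>10"]),
    5: (2, "SPORT_CAR", ["passenger=2"]),
    6: (1, "TRUCK", ["weight>2000kg", "speed>30km/h"]),
    7: (6, "BUS", ["publicTransport=YES"]),
    8: (1, "GLIDER", ["moveOn=AIR"]),
    9: (1, "GROUND_VEHICLE", ["moveOn=GROUND"]),
}


def rule_path(rule_id):
    """Conditions of all predecessor rules followed by the rule's own."""
    parent, _, own = RULES[rule_id]
    inherited = rule_path(parent) if parent in RULES else []
    return inherited + own


def classes():
    """Group rule paths by conclusion: a class is a set of disjuncts."""
    by_class = {}
    for rule_id, (_, conclusion, _) in RULES.items():
        by_class.setdefault(conclusion, []).append(rule_path(rule_id))
    return by_class


for disjunct in classes()["BUS"]:
    print("BUS <-", ", ".join(disjunct))
```

The class BUS comes out as a disjunction of two rule paths, which is the scattering the method sets out to combine.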

The basic idea of the technique is to combine all rules of the same class and then
compute a quantitative measurement (from 0 to 1) of each relation (subsumption,
mutual-exclusivity, similarity) between two classes. We use this quantitative
measurement as an informal confidence measure as to whether these relations exist.
For example, the class VEHICLE subsumes class BUS with degree of confidence 0.9;
class VAN and class SPORT_CAR are mutually exclusive with degree
of confidence 1.0; class BUS and TRUCK have a degree of similarity of 0.3. (The
derivation of these quantities is discussed in detail later.)

Fig. 2. An example of a simple MCRDR KBS.



The subsumption measure provides information on whether a class almost
subsumes another class (for example, class VEHICLE subsumes class BUS with
degree of confidence 0.9). Using this quantitative measure we can say class
VEHICLE almost subsumes class BUS, rather than simply saying class VEHICLE
subsumes class BUS or class VEHICLE does not subsume class BUS.

These measures find application in real examples such as: class “The mild
hypothyroidism may contribute to hyperlipidaemia” is subsumed by class
“Hypothyroidism may exacerbate hyperlipidaemia” with degree of confidence 0.75.
Boolean (true/false) values are also inappropriate for the relations subsumption,
mutual-exclusivity and similarity in real domains.

In the 3710 rule application we considered, we found only 4 subsumption relations
with degree of confidence 1.0, 181 mutually-exclusive relations with degree of
confidence 1.0 and no similarity relations with degree of confidence 1.0.

Since we learn from rules, not from cases, our method does not need to consider 
which attributes are relevant. Gaines [6] argues that a rule in a knowledge base is 
worth more than many cases for learning. We adopt the same viewpoint. 

In the next section we outline RDR and FCA. We then discuss the derivation of the 
quantitative measures. Finally we describe our experiment to test this technique on 
the 3710 rule KB.



2 Ripple Down Rules and Formal Concept Analysis



1.1 Ripple Down Rules 

RDR is an attempt to deal with the problem that experts never explain how they reach 
a conclusion, rather they justify why their conclusion is appropriate and this 
justification varies with the context in which it is given [3]. With RDR the expert’s 
task is to correct errors made by the KBS. The expert decides a case should have a 
different conclusion from the one given by the KBS and justifies this by indicating 
features in the case which distinguish it from a case for which the conclusion 
provided by the KBS would have been appropriate. This results in the refinement 
structure shown in figure 2 illustrating where a rule (rule 10) has been added to
correct an error. 

During RDR inference, data is passed down the tree and if a rule fires, its 
conclusion is added to the case. Child rules are only evaluated if a parent rule is 
satisfied. However if one or more child rules fires, their conclusions are added to the 
case and the parent conclusion removed. Their children are then evaluated. This 
process continues down each path until a leaf node is reached or no child rule fires. 
There are a number of RDR structures. This one is known as multiple classification 
RDR (MCRDR) and was the structure used for the 3710 rule KB. 

To correct one of the conclusions given, the RDR system will add a new child rule 
under the rule that gave the wrong conclusion. The expert selects conditions for this 
rule which distinguish the case from other cases for which the parent rule conclusion 
was correct. In practice, the cases used for this are the cases for which other rules 
have been previously added, known as ‘cornerstone cases’. The case for which this 
new rule is being added will also be stored as a future cornerstone case. When a new 
rule is added, any ‘cornerstone cases’ that can reach the parent rule are retrieved. The 
expert adds a rule which excludes all of these cornerstone cases or can assign the 
conclusion as an extra conclusion for a particular case. In practice, even with 
hundreds of cornerstone cases that might reach the rule, the expert selects conditions 
to make a sufficiently precise rule after no more than two or three cornerstone cases 
have been considered. This is essentially a verification and validation technique
incorporated into knowledge acquisition [13]. 

Figure 2 illustrates a simple example of an MCRDR KB. For example, if we have
a case hasEngine=YES, passenger=2, moveOn=GROUND, this case will fire rule 2,
rule 5 and rule 9. In MCRDR if a child rule is fired, the child conclusion replaces the
parent conclusion. Thus 3 nodes fire but 2 conclusions are given: SPORT_CAR and
GROUND_VEHICLE. If we get a new case, MOTORCYCLE <- hasEngine=YES,
passenger=2, wheel=2, this case will be misclassified as SPORT_CAR, so the
expert uses this to build rule 10 as a child of rule 5 and the case is stored as a
cornerstone case for future rule addition.
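The inference walk-through above can be reproduced with a minimal sketch. It is not the MCRDR implementation used for the 3710 rule KB: the tree layout is inferred from the rule paths (rule 1 assumed to be an empty root), conditions are written as hypothetical `(attribute, operator, value)` triples, and the condition chosen for rule 10 (`wheel = 2`) is our guess at what an expert might select.

```python
import operator

OPS = {"=": operator.eq, ">": operator.gt}

# rule id -> (parent id, conclusion, conditions); layout inferred (assumption).
RULES = {
    2: (1, "VEHICLE", [("hasEngine", "=", "YES")]),
    3: (2, "VAN", [("passenger", ">", 5)]),
    4: (2, "BUS", [("passenger", ">", 10)]),
    5: (2, "SPORT_CAR", [("passenger", "=", 2)]),
    6: (1, "TRUCK", [("weight", ">", 2000), ("speed", ">", 30)]),
    7: (6, "BUS", [("publicTransport", "=", "YES")]),
    8: (1, "GLIDER", [("moveOn", "=", "AIR")]),
    9: (1, "GROUND_VEHICLE", [("moveOn", "=", "GROUND")]),
}


def fires(rule_id, case):
    """A rule fires when the case satisfies all of its conditions."""
    _, _, conditions = RULES[rule_id]
    return all(attr in case and OPS[op](case[attr], value)
               for attr, op, value in conditions)


def infer(case, parent=1):
    """Return (fired rule ids, conclusions) for the subtree below `parent`."""
    fired, conclusions = [], []
    for rule_id, (rule_parent, conclusion, _) in RULES.items():
        if rule_parent != parent or not fires(rule_id, case):
            continue
        fired.append(rule_id)
        child_fired, child_conclusions = infer(case, rule_id)
        fired += child_fired
        # A firing child replaces the conclusion of its parent.
        conclusions += child_conclusions or [conclusion]
    return fired, conclusions


case = {"hasEngine": "YES", "passenger": 2, "moveOn": "GROUND"}
fired, conclusions = infer(case)
print(fired, conclusions)

# The motorcycle case is misclassified until rule 10 is added under rule 5.
motorcycle = {"hasEngine": "YES", "passenger": 2, "wheel": 2}
before = infer(motorcycle)[1]
RULES[10] = (5, "MOTORCYCLE", [("wheel", "=", 2)])
after = infer(motorcycle)[1]
print(before, after)
```

Cornerstone-case checking, which guards existing conclusions when rule 10 is added, is left out of the sketch.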

The RDR exception structure provides a compact knowledge representation [2],
[8], [14], [19], [21]. Further, despite the random order in which cases are presented,
and the likelihood of experts providing less than ideal rules, manually built RDR
systems produce compact KBs [4], [11], [22]. Gaines has also generalized RDR to
Exception Directed Acyclic Graphs (EDAGs). He argues EDAGs are more compact
than RDR since the EDAG graph structure avoids the fragmentation problem of the
RDR tree structure [7].

Initial RDR development was concerned with classification tasks, first single and 
later multiple classification. RDR has since been extended to configuration [16], 
heuristic search [1], document retrieval [12] and a more general RDR system for 
construction tasks has been proposed [5].



1.2 Formal Concept Analysis 

Formal Concept Analysis, as proposed by Wille [23], offers a mathematical
formalisation of concepts as units of thought constituted by their extension and
intension. FCA defines a formal context as a triple $(G,M,I)$, where $G$ is a set of
objects, $M$ is a set of attributes and $I$ is a binary relation between $G$ and $M$, that is
$I \subseteq G \times M$. For example the sentence “the object g has the attribute m” can be
written in the FCA framework as $gIm$ ($\iff (g,m) \in I$).

In our example MCRDR KBS (see figure 2), G is {vehicle, aeroplane, ground
vehicle, sport car, bus, truck, van} and M is {hasEngine, weight>2000, speed>30,
moveOnGround, moveOnAir, passenger=2, passenger>5, passenger>10,
publicTransport}. MCRDR attributes have many values, giving a many-valued
context. To obtain a concept lattice we need to convert these attributes to
single-valued boolean attributes, giving a single-valued context.

A formal concept of a formal context $(G,M,I)$ is defined as a pair $(A,B)$, where
$A \subseteq G$ and $B \subseteq M$, such that $(A,B)$ is maximal with the property
$A \times B \subseteq I$. The set $A$ is the extent of the formal concept $(A,B)$ and the
set $B$ is the intent of the formal concept $(A,B)$.

The subconcept-superconcept relation is defined by
$(A_1,B_1) \le (A_2,B_2) :\iff A_1 \subseteq A_2 \ (\iff B_2 \subseteq B_1)$.
The concept lattice is a complete lattice which consists of the set of all
concepts of a context $(G,M,I)$ together with the order relation $\le$.
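The definitions of formal concept and sub-concept can be made concrete with a brute-force sketch. The small single-valued context below is an assumption built from four of the toy rule paths; it is not the context table of figure 3 or figure 4, and the exhaustive enumeration is only suitable for very small contexts.

```python
from itertools import combinations

# object -> set of boolean attributes (assumed toy context).
CONTEXT = {
    "vehicle": {"hasEngine"},
    "van": {"hasEngine", "passenger>5"},
    "sport car": {"hasEngine", "passenger=2"},
    "truck": {"weight>2000", "speed>30"},
}
ATTRIBUTES = set().union(*CONTEXT.values())


def common_attributes(objects):
    """Attributes shared by all given objects (all attributes if none given)."""
    sets = [CONTEXT[g] for g in objects]
    return set.intersection(*sets) if sets else set(ATTRIBUTES)


def common_objects(attributes):
    """Objects that have every one of the given attributes."""
    return {g for g, attrs in CONTEXT.items() if attributes <= attrs}


def concepts():
    """Enumerate all formal concepts (extent, intent) of CONTEXT."""
    found = set()
    for r in range(len(CONTEXT) + 1):
        for objs in combinations(sorted(CONTEXT), r):
            intent = common_attributes(objs)
            extent = common_objects(intent)  # closing the set makes it maximal
            found.add((frozenset(extent), frozenset(intent)))
    return found


def is_subconcept(c1, c2):
    """(A1, B1) <= (A2, B2) iff A1 is a subset of A2."""
    return c1[0] <= c2[0]


all_concepts = concepts()
van = (frozenset({"van"}), frozenset({"hasEngine", "passenger>5"}))
engine = (frozenset({"vehicle", "van", "sport car"}), frozenset({"hasEngine"}))
print(len(all_concepts), is_subconcept(van, engine))
```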

In the formal concept analysis framework, the choice of attributes depends on the 
purpose of the analysis. In the example in figure 3 we analyse the passenger capacity 
and the type of movement. 

In the example in figure 4, we derive a formal context for all the attributes. Since
the concept BUS is created from two different contexts, we rename these as BUS-1
and BUS-2. Both describe the same concept but in different contexts, so they
appear as two formal concepts. An imaginary concept “BUS-1 and BUS-2” is a
sub-concept of both BUS-1 and BUS-2. Since we have many classes that consist of
disjunctions of many rules, those classes will be scattered across many nodes of the
concept lattice if displayed in a general context (of all attributes).

In a real-world MCRDR KBS the class Satisfactory lipid profile is a disjunction of 
2 rules (see Fig 1) similar to the toy example of the class BUS. 

Fig. 3. A concept lattice and formal context table

Fig. 4. A concept lattice and formal context table



2 The Class Relations Model 

The measures we have developed are strictly heuristic. Other superior and perhaps 
more well founded measures may be possible. The studies here represent simply a 
first attempt at carrying out this type of analysis and integrating across disjuncts. The 
second point to note is that the purpose of these relations is to deal with non-boolean 
data. 

The technique is based on set theory. If we have two sets A and B, then the 
following indicates the possible relations between them according to the three 
relations we are considering. 



- $A \subset B$
- $B \subset A$
- $A \cap B \neq \emptyset$
- $A \cap B = \emptyset$
- $A = B$
- $A \neq B$

Fig. 5 shows a particular example where $B \subset A$, $C \subset A$ and
$B \cap C = \emptyset$. That is, B and C are mutually exclusive.

Let $X$ be a class in the MCRDR framework. $\{X_0, \ldots, X_m\}$ is the set of rules which have
class $X$ as their conclusion. $\{X_{i0}, \ldots, X_{in}\}$ is the set of conditions for rule $X_i$,
where $n+1$ is the number of possible distinct attributes in the domain. However, in a given
rule some of these conditions may be empty because the rule may not contain every possible
distinct attribute. Secondly, if $X_{ij}$ and $Y_{kj}$ stand for individual conditions in rule paths
for the classes $X$ and $Y$, then although $X_{ij}$ and $Y_{kj}$ both refer to the same attribute
they may be different conditions, for example age = 50 and age > 55.

In the MCRDR framework the class is given as a disjunction of rule paths [18]. If m 
is the number of rule paths for class X, then: 

$$\text{class } X = \bigvee_{i=0}^{m} \Big( \bigwedge_{j=0}^{n} X_{ij} \Big)$$

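If $Y$ is also a class, we could define a similarity measure as follows:

$$\mathrm{Sim}(X_{ij}, Y_{kj}) = \begin{cases} 1 & \text{if } X_{ij} \text{ and } Y_{kj} \text{ are the same} \\ 0 & \text{if } X_{ij} \text{ and } Y_{kj} \text{ are different} \end{cases}$$

If $\alpha$ is the set of distinct attributes in rule path $X_i$ and $\beta$ is the set of distinct
attributes in rule path $Y_k$, then we can define:

$$\mathrm{Similar}(X_i, Y_k) = \frac{\sum_{j=0}^{n} \mathrm{Sim}(X_{ij}, Y_{kj})}{|\alpha \cup \beta|}$$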



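Table 1. An example of applying Similar(X_i, Y_k)

Measure    | rule-path | has engine | move on | passenger | weight    | speed     | public transp. | Value
-----------|-----------|------------|---------|-----------|-----------|-----------|----------------|------
Similarity | Bus-1     | YES        |         | >10       |           |           |                | 0/4
           | Truck     |            |         |           | >2000     | >30       |                |
           |           | different  |         | different | different | different |                |
Similarity | Bus-2     |            |         |           | >2000     | >30       | YES            | 2/3
           | Truck     |            |         |           | >2000     | >30       |                |
           |           |            |         |           | same      | same      | different      |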

Table 1 shows the derivation of Similar(Bus-1, Truck) = 0/4 and Similar(Bus-2,
Truck) = 2/3. Fig. 6 suggests how we can find a similarity measure between 2 classes.
Each node in Fig. 6 corresponds to a rule path; v stands for the Similar() function
relating two rule paths or disjuncts. (In later similar diagrams v stands for the
Subsume() and MutualEx() measures.) Fig. 6 shows that Class X is the disjunction of
nodes 1, 2 and 3 and Class Y is the disjunction of nodes 4 and 5. We propose that
ClassSimilarity(X,Y) = (v1 + v2 + v3) / 3, where we choose the v such that all nodes
are covered by at least one edge and the sum of v (e.g. v1 + v2 + v3) is maximal (e.g.
ClassSimilarity(Bus,Truck) = (0/4 + 2/3)/2).
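The similarity measures can be sketched as below. Two points are our assumptions rather than part of the definition: rule paths are dictionaries from attribute to a condition string, and the edge selection is approximated by linking every rule path to its best match in the other class, which covers all nodes but is not guaranteed to be the maximal-sum cover in every case.

```python
from fractions import Fraction

# Rule paths from the toy example: attribute -> condition.
BUS_1 = {"hasEngine": "YES", "passenger": ">10"}
BUS_2 = {"weight": ">2000", "speed": ">30", "publicTransport": "YES"}
TRUCK = {"weight": ">2000", "speed": ">30"}


def similar(x, y):
    """Identical conditions divided by the attributes used in either path."""
    attributes = set(x) | set(y)
    same = sum(1 for a in attributes if a in x and a in y and x[a] == y[a])
    return Fraction(same, len(attributes))


def class_similarity(xs, ys):
    """Average Similar() over edges that cover every rule path (greedy choice)."""
    edges = set()
    for i, x in enumerate(xs):
        best = max(range(len(ys)), key=lambda k: similar(x, ys[k]))
        edges.add((i, best))
    for k, y in enumerate(ys):
        best = max(range(len(xs)), key=lambda i: similar(xs[i], y))
        edges.add((best, k))
    return sum(similar(xs[i], ys[k]) for i, k in edges) / len(edges)


print(similar(BUS_1, TRUCK), similar(BUS_2, TRUCK))
print(class_similarity([BUS_1, BUS_2], [TRUCK]))
```

Exact fractions are used so that the results can be compared directly with the 0/4, 2/3 and (0/4 + 2/3)/2 values in the text.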

We can similarly define a subsumption measure as follows, with $X_{ij}$ and $Y_{kj}$ again
standing for individual conditions in a rule path for the relevant class as above. As
before:

$$\mathrm{Sub}(X_{ij}, Y_{kj}) = \begin{cases} 1 & \text{if } X_{ij} \text{ subsumes or is the same as } Y_{kj} \text{ (for example } A > 5 \text{ subsumes } A > 10\text{)} \\ 0 & \text{if } X_{ij} \text{ does not subsume } Y_{kj} \end{cases}$$





Fig. 6. Similarity

Fig. 7. Subsumptions

Fig. 8. Mutual-exclusivity

If $\alpha$ is the set of distinct attributes in rule $X_i$ and $\beta$ is the set of distinct
attributes in rule $Y_k$, then we can define:

$$\mathrm{Subsume}(X_i, Y_k) = \begin{cases} \dfrac{\sum_{j=0}^{n} \mathrm{Sub}(X_{ij}, Y_{kj})}{|\alpha \cup \beta|} & \text{if } \alpha \le \beta \\ 0 & \text{if } \alpha > \beta \end{cases} \qquad (3)$$



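Table 2. An example of applying Subsume(X_i, Y_k).

Measure     | rule-path | has engine | move on | passenger | weight | speed | public transp. | Value
------------|-----------|------------|---------|-----------|--------|-------|----------------|------
Subsumption | Vehicle   | YES        |         |           |        |       |                | 2/2
            | Bus-1     | YES        |         | >10       |        |       |                |
            |           | same       |         | sub       |        |       |                |
Subsumption | Vehicle   | YES        |         |           |        |       |                | 3/4
            | Bus-2     |            |         |           | >2000  | >30   | YES            |
            |           | not        |         |           | sub    | sub   | sub            |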

As before, Table 2 shows the derivation of Subsume(Vehicle, Bus-1) = 2/2 and
Subsume(Vehicle, Bus-2) = 3/4. Given that the function Subsume() measures a degree
of confidence that the first rule path subsumes the second rule path, Figure 7 suggests
how we can find a subsumption measure between the 2 classes. It shows Class X as a
disjunction of nodes 1, 2 and 3 and Class Y as a disjunction of nodes 4 and 5 (each
node again represents a rule path). We compute ClassSubsume(X,Y) = (v1 + v3) / 2.
We choose the v such that all nodes of class Y are covered by at least one edge and
the sum of v (e.g. v1 + v3) is maximal. We then compute TotalSubsume(X,Y) =
ClassSubsume(X,Y) - ClassSubsume(Y,X). If the value of TotalSubsume(X,Y) is
negative, then we exchange X and Y (see Fig. 7) (e.g. ClassSubsume(Vehicle, Bus) =
(2/2 + 3/4) / 2).
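A sketch of the subsumption measures follows. Several readings are assumptions inferred from Table 2 rather than stated definitions: the comparison between the attribute sets is taken to be a comparison of their sizes, an empty condition in the first path counts as subsuming, a condition in the first path on an attribute the second path leaves open counts as not subsuming, and only attributes used by either path are summed. Conditions are written as hypothetical `(operator, value)` pairs.

```python
from fractions import Fraction

VEHICLE = {"hasEngine": ("=", "YES")}
BUS_1 = {"hasEngine": ("=", "YES"), "passenger": (">", 10)}
BUS_2 = {"weight": (">", 2000), "speed": (">", 30), "publicTransport": ("=", "YES")}


def cond_subsumes(x, y):
    """1 if condition x is the same as, or more general than, condition y."""
    if x is None:      # no constraint in the first path
        return 1
    if y is None:      # the first path constrains what the second leaves open
        return 0
    if x == y:
        return 1
    (x_op, x_val), (y_op, y_val) = x, y
    if x_op == y_op == ">":
        return int(x_val <= y_val)   # e.g. A > 5 subsumes A > 10
    if x_op == y_op == "<":
        return int(x_val >= y_val)
    if x_op == ">" and y_op == "=":
        return int(y_val > x_val)
    if x_op == "<" and y_op == "=":
        return int(y_val < x_val)
    return 0


def subsume(x, y):
    """Degree to which rule path x subsumes rule path y."""
    if len(x) > len(y):              # assumed reading of the side condition
        return Fraction(0)
    attributes = set(x) | set(y)
    total = sum(cond_subsumes(x.get(a), y.get(a)) for a in attributes)
    return Fraction(total, len(attributes))


def class_subsume(xs, ys):
    """Cover every rule path of Y with its best subsuming rule path of X."""
    return sum(max(subsume(x, y) for x in xs) for y in ys) / len(ys)


def total_subsume(xs, ys):
    return class_subsume(xs, ys) - class_subsume(ys, xs)


print(subsume(VEHICLE, BUS_1), subsume(VEHICLE, BUS_2))
print(class_subsume([VEHICLE], [BUS_1, BUS_2]))
```

Under these assumptions the sketch reproduces the 2/2 and 3/4 entries of Table 2 and the value (2/2 + 3/4)/2 = 0.875 for Vehicle and Bus.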

We can define a mutual exclusivity measure as follows, with $X_{ij}$ and $Y_{kj}$ again
standing for individual conditions in a rule path. As before:

$$\mathrm{Mut}(X_{ij}, Y_{kj}) = \begin{cases} 1 & \text{if } X_{ij} \text{ and } Y_{kj} \text{ are mutually exclusive (for example } A > 5 \text{ and } A < 2\text{)} \\ 0 & \text{if } X_{ij} \text{ and } Y_{kj} \text{ are not mutually exclusive} \end{cases}$$

We can then define:

$$\mathrm{MutualEx}(X_i, Y_k) = \begin{cases} 1 & \text{if } \mathrm{Mut}(X_{ij}, Y_{kj}) = 1 \text{ for at least one } j \\ 0 & \text{otherwise} \end{cases}$$



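Table 3. An example of applying MutualEx(X_i, Y_k).

Measure            | rule-path | has engine | move on | passenger | weight   | speed    | public transp. | Value
-------------------|-----------|------------|---------|-----------|----------|----------|----------------|------
Mutual exclusivity | Bus-1     | YES        |         | >10       |          |          |                | 1
                   | SportCar  | YES        |         | =2        |          |          |                |
                   |           | not m.e.   |         | mut. ex.  |          |          |                |
Mutual exclusivity | Bus-2     |            |         |           | >2000    | >30      | YES            | 0
                   | SportCar  | YES        |         | =2        |          |          |                |
                   |           | not m.e.   |         | not m.e.  | not m.e. | not m.e. | not m.e.       |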

As before, Table 3 shows the derivation of MutualEx(Bus-1, SportCar) = 1 and
MutualEx(Bus-2, SportCar) = 0. Given that the function MutualEx() measures a degree of
confidence that two rule paths are mutually exclusive, figure 8 suggests how we can
find a mutual exclusivity measure between the 2 classes. It shows Class X as a
disjunction of nodes 1, 2 and 3 and Class Y as a disjunction of nodes 4 and 5. We
compute ClassMutualEx(X,Y) = (v1 + v2 + v3 + v4 + v5 + v6) / 6. X and Y are
mutually exclusive if and only if all nodes of X and Y are mutually exclusive with
respect to each other (see Fig. 8) (e.g. ClassMutualEx(Bus, SportCar) = (1 + 0)/2).
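The mutual exclusivity measures can be sketched in the same style. The `(operator, value)` encoding of conditions and the helper names are assumptions; a missing condition on either side is treated as not mutually exclusive, as in Table 3.

```python
from fractions import Fraction
from itertools import product

BUS_1 = {"hasEngine": ("=", "YES"), "passenger": (">", 10)}
BUS_2 = {"weight": (">", 2000), "speed": (">", 30), "publicTransport": ("=", "YES")}
SPORT_CAR = {"hasEngine": ("=", "YES"), "passenger": ("=", 2)}


def holds(cond, value):
    """Does `value` satisfy a '>' or '<' condition?"""
    op, bound = cond
    return value > bound if op == ">" else value < bound


def cond_exclusive(x, y):
    """True if two conditions on the same attribute can never both hold."""
    if x is None or y is None:
        return False
    (x_op, x_val), (y_op, y_val) = x, y
    if x_op == "=" and y_op == "=":
        return x_val != y_val
    if x_op == "=":
        return not holds(y, x_val)
    if y_op == "=":
        return not holds(x, y_val)
    if x_op == ">" and y_op == "<":
        return y_val <= x_val        # e.g. A > 5 and A < 2
    if x_op == "<" and y_op == ">":
        return x_val <= y_val
    return False


def mutual_ex(x, y):
    """1 if at least one pair of conditions is mutually exclusive, else 0."""
    attributes = set(x) | set(y)
    return int(any(cond_exclusive(x.get(a), y.get(a)) for a in attributes))


def class_mutual_ex(xs, ys):
    """Average MutualEx() over all pairs of rule paths of the two classes."""
    pairs = list(product(xs, ys))
    return Fraction(sum(mutual_ex(x, y) for x, y in pairs), len(pairs))


print(mutual_ex(BUS_1, SPORT_CAR), mutual_ex(BUS_2, SPORT_CAR))
print(class_mutual_ex([BUS_1, BUS_2], [SPORT_CAR]))
```

A class-level value of 1 therefore requires every pair of rule paths to be mutually exclusive, which matches the "if and only if" statement above.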



3 Experimental Results 

The initial results are from the 9 rules of the toy example in figure 2. As can be seen
from these results (Table 4), the measures work reasonably well.

The mutual exclusivity and subsumption relations make reasonable sense but
nothing should be concluded from the actual values. Of course all these classes have
some similarity as they are all vehicles.

Of more interest are the results from the 3710 rules (see Table 5) of a real
pathology system. The results shown in each table are the five class pairs with the
highest similarity, subsumption, or mutual exclusivity measures. The two columns,
Class 1 and Class 2, indicate the classes to which the relation applies. The numbers
correspond to the classes in Table 6. At this stage, we are not in a position to make a
medical or pathology assessment of these results, but note the following examples:

“Satisfactory lipid profile, unless patient has CHD” subsumes “Lipid profile
improving If patient has CHD aim for LDL < 2.6 mmol/L. Diabetes and
hypertriglyceridaemia noted”, with degree of confidence 1.0.

“Raised cholesterol and triglycerides. The mild hypothyroidism may contribute to
hyperlipidaemia” is similar to “Raised cholesterol and triglycerides. Hypothyroidism
may contribute to hyperlipidaemia” with degree of similarity 0.7.

“Raised triglycerides with abnormal fasting glucose. Suggest glucose tolerance test”
is mutually exclusive to “Hypercholesterolaemia. Suggest glucose tolerance test”.

These all seem reasonable. However, a number of the other examples do not make
as much sense to a lay reader.



Table 4. Some results from the 9 rule toy KBS.

Relation                | Class 1        | Class 2   | Value
------------------------|----------------|-----------|------
Similarity >= 0.3       | Vehicle        | Van       | 0.500
                        | Vehicle        | Sport car | 0.500
                        | Van            | Sport car | 0.500
                        | Bus            | Truck     | 0.330
MutualExclusivity > 0   | Van            | Sport car | 1.000
                        | Ground vehicle | Glider    | 1.000
                        | Bus            | Sport car | 0.500
Subsumption >= 0.875    | Vehicle        | Van       | 1.000
                        | Vehicle        | Sport car | 1.000
                        | Vehicle        | Bus       | 0.875



Table 5. Some results from the 3710 rule real-world KBS 

Sort-by          Class 1  Class 2  1 subsume 2  mutual  similar 
subsume big-5    2        17       1.000        0.000   0.400 
                 12       58       1.000        0.000   0.444 
                 24       64       1.000        0.667   0.166 
                 61       64       1.000        0.000   0.286 
                 24       61       0.939        0.667   0.456 

Sort-by          Class 1  Class 2  1 subsume 2  mutual  similar 
similar big-5    29       70       0.705        0.000   0.753 
                 9        31       0.625        0.000   0.625 
                 43       48       0.136        0.000   0.592 
                 52       59       0.000        0.392   0.591 
                 13       40       0.029        0.402   0.580 

Sort-by          Class 1  Class 2  1 subsume 2  mutual  similar 
mutual-ex big-5  4        68       0.036        1.000   0.145 
                 6        18       0.118        1.000   0.235 
                 6        25       0.108        1.000   0.212 
                 6        43       0.082        1.000   0.189 
                 6        44       0.000        1.000   0.210 



Table 6. Some classes (clinical interpretations of results) from the 3710 rule 
Pathology KBS. 



#Id  Interpretations or classes 
2    Satisfactory lipid profile, unless patient has CHD. 
4    Raised triglycerides with abnormal fasting glucose. Suggest glucose tolerance test. 
6    Raised cholesterol and triglycerides. Suggest 2 hr post-prandial glucose and TSH if not 
     previously done. 
9    Raised cholesterol and triglycerides. Suggest TSH if not previously done. Normal glucose 
     levels suggest diabetes has been excluded. 
12   Borderline cholesterol level. Suggest glucose tolerance test in view of raised fasting 
     glucose (if not a known diabetic). 
13   Raised cholesterol and triglycerides with low HDL. Suggest repeat fasting glucose in 
     12 months. 
17   Lipid profile improving. If patient has CHD aim for LDL < 2.6 mmol/L. Diabetes and 
     hypertriglyceridaemia noted. 
18   Hypercholesterolaemia. Diabetes and hypothyroidism are excluded. 
24   Mild hypercholesterolaemia with raised fasting glucose. Suggest repeat glucose tolerance 
     test in 12 months. 
25   Hypercholesterolaemia. Aim for LDL <3.4 mmol/L (<2.6 if patient has CHD). Suggest 2 hr 
     post-prandial glucose to exclude diabetes. 
29   Raised cholesterol and triglycerides. The mild hypothyroidism may contribute to 
     hyperlipidaemia. 
31   Marked hypertriglyceridaemia. Suggest glucose tolerance test. 
40   Raised cholesterol and triglycerides. Suggest repeat fasting glucose in 12 months. 
43   Borderline HDL. Otherwise satisfactory lipid profile. 
44   Satisfactory lipid profile, unless patient has CHD (aim for LDL < 2.6). 
48   Low HDL. Otherwise satisfactory lipid profile. 
52   Raised cholesterol and triglycerides. Glucose level to follow. 
58   Raised cholesterol and triglycerides with raised fasting glucose. Suggest glucose 
     tolerance test. 
59   Raised cholesterol and triglycerides. TSH and glucose to follow. 
61   Suggest repeat glucose tolerance test in 12 months. 
64   Raised triglycerides. Impaired glucose tolerance noted. 
68   Hypercholesterolaemia. Suggest glucose tolerance test. 
70   Raised cholesterol and triglycerides. Hypothyroidism may contribute to hyperlipidaemia. 



4 Further Work 

Firstly, there may be other similar types of measure that may be more useful in 
relating classes as a whole. We will be considering a number of these. Secondly, 
although this approach allows classes to be considered as a whole across a knowledge 
base, there are still too many relations to provide a reasonable overview of the domain. 
One possible approach to making the results more accessible is to consider patterns of 
relations rather than individual relations. In the results above, individual relations are 
ranked, or a cut-off can be considered. However, we could also look for patterns 
where, say, A subsumes both B and C, and B and C are mutually exclusive, and 
present a whole hierarchy with this pattern to the user. Different ontological patterns 
could be considered. Preliminary results using this approach suggest that it may be 
useful for extracting more interesting information. In a similar vein, it will be 
interesting to form concept clusters based on appropriate cut-offs for the relations, and 
then merge the classes and re-express the relations. Thirdly, the present technique 
only considers the conditions in rule paths in determining the relations. It does not 
consider any other information about the classes that could be derived from the tree 
structure. For example, although the refinement structure for RDR does not indicate 
any ontological refinement, it does indicate that an expert thought a conclusion was 
inappropriate and so should be replaced by another. We are looking at the possible 
relations between conclusions that this would allow, which can be combined with the 
idea of relations presented here. Finally, we would suggest again that there may be 
considerable value in attempting to discover ontologies from knowledge bases that are 
already developed or undergoing incremental development. Moreover, although this 
work is still at an early stage, we would claim that the techniques we have presented, 
or similar techniques, seem likely to be suitable for ontology discovery from large, 
real-world knowledge bases. 
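
The pattern search suggested above (a class that subsumes two others which are mutually exclusive) could be sketched as follows. This is an assumption-laden illustration, not the preliminary implementation mentioned in the text: the function name find_sibling_patterns, the dictionary layout, and the two cut-off values are hypothetical, and the sample values are taken from the toy results in Table 4.

```python
from itertools import combinations


def find_sibling_patterns(classes, subsumes, mutual_ex, sub_cut, mx_cut):
    """Return (A, B, C) triples where A subsumes B and C, and B, C exclude each other."""
    patterns = []
    for a in classes:
        # Classes that A subsumes at or above the chosen cut-off.
        children = [c for c in classes
                    if c != a and subsumes.get((a, c), 0.0) >= sub_cut]
        for b, c in combinations(children, 2):
            # The measure is stored once per pair, so check both orders.
            score = max(mutual_ex.get((b, c), 0.0), mutual_ex.get((c, b), 0.0))
            if score >= mx_cut:
                patterns.append((a, b, c))
    return patterns


classes = ["Vehicle", "Van", "Sport car", "Bus"]
subsumes = {
    ("Vehicle", "Van"): 1.000,
    ("Vehicle", "Sport car"): 1.000,
    ("Vehicle", "Bus"): 0.875,
}
mutual_ex = {("Van", "Sport car"): 1.000, ("Bus", "Sport car"): 0.500}

patterns = find_sibling_patterns(classes, subsumes, mutual_ex,
                                 sub_cut=0.875, mx_cut=0.5)
print(patterns)
```

Raising mx_cut to 1.0 would keep only the Van and Sport car pair under Vehicle, which shows how the choice of cut-off shapes the hierarchy presented to the user.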
Abstract. The traditional distinction between theories (developed to 
tackle theoretical problems) and applications (based on theories and 
realized to help a user) is blurred in computer science. To experiment with 
their theories, computer scientists often write programs. This paper focuses 
on the features that make such a program an application (also in the 
software engineering sense). The discussion is aimed more specifically at 
artificial intelligence applications, especially the conceptual graphs 
applications presented in ICCS papers, and at the importance of applications 
for a scientific domain. 



1 Introduction 

In the traditional distinction between scientists, technologists, and engineers, 
engineers design and realize applications, technologists design and realize the 
tools from which applications are built, and both tools and applications are built 
from theories developed by scientists. This is a very simplified view of how things 
work in informatics. As in any young science, the three aspects are closely tied, 
and the cycle is often short: scientists develop theories, and sometimes tools, 
in order to tackle actual problems (sometimes the same person carries out the whole 
work; Knuth is a brilliant example). Usually, as scientists, we present things 
from theory to applications: scientific hypotheses, theoretical results, tools, and 
applications [6]. 

In this paper, we only consider the final goal: applications. Application is 
understood in its usual software engineering meaning, that is to say, an 
application is a program of a specific type. The use of CGs for research purposes in 
another scientific domain is not considered here, even if this kind of research work 
could be called, and sometimes is called, a CGs application. 

The plan of the paper is as follows. In section 2 we propose rough distinctions 
between several types of internal and external features of an application. 
In section 3 we discuss some external features of an application, and we 
distinguish between an application and not yet finalized programs such as 
prototypes, mock-ups, experiments, etc. In section 4 some specific aspects of an AI 
application are presented. Section 5 is a brief survey of CGs applications 
presented from ICCS’93 to ICCS’99, and our own situation, quite similar to that of 
other CG research groups involved in application development, is presented in 
section 6. Section 7 is a (somewhat pessimistic) conclusion. 

2 Internal and External Characteristics 

Roughly speaking, an application has internal characteristics and external 
characteristics. Two types of internal characteristics can be distinguished. The first 
type is related to software engineering. An application 
must have precise specifications, its conditions of use must be described, it must 
be reliable and robust, it must be validated, documentation must be available, 
its evolution must be anticipated, and so on. Obviously, this does not always 
correspond to reality, and many applications, even with millions of copies sold, do 
not fulfill these conditions. The sale of such unfinished applications can 
be explained by the marketing choices of some companies, especially their desire to 
occupy the market. It also illustrates how difficult it is to provide 
an application that is perfectly robust and runs on a large variety of platforms. 
We will not consider software engineering criteria here, for two 
main reasons. Firstly, we do not have the information needed to analyze, 
from the software engineering viewpoint, the “applications” that 
interest us here, that is to say CGs applications. Secondly, there are too few 
“real” (in the traditional software engineering meaning) CGs applications. 

The second type of internal characteristics is related to the scientific field 
that is the basis of the software construction. We specify in section 4 what we 
understand by AI software. 

A program may be presented in an “applications” session of an artificial 
intelligence conference, or a CGs conference, but it does not become, ipso facto, an 
AI or a CGs application. In the same way, a “running” program that respects the 
software engineering state of the art is not necessarily an application. External 
characteristics, those which consider the software from outside, like a black box, 
are essential for software which claims to be an application. 

3 External View of an Application 

In this section a set of general characteristics of an application is proposed. 
These characteristics can be used as a filter for separating applications from 
other programs such as prototypes, tools, toy applications, mock-ups, etc. These 
features also show that qualifying a piece of software as an application is more a 
job for ergonomists and specialists in the application domain than for 
informaticians. 



3.1 An Application Should Aid Solution of Actual Problems 

Building a diagnostic system in medicine or a document retrieval system is 
clearly an actual problem which can lead to an application. It is not always so easy to 
say what an actual problem is. The following excerpt from the editorial of the AI 
special issue “40 years later” [5] does not help very much: “The first set of papers 
provide a look at how AI is being used in “applications”, problems embedded in 
and defined by a real world environment”. Many pure scientific problems are 
embedded in and defined by real world environments; SAT, for instance, is such 
a problem. If somebody provides an efficient algorithm for solving large sets of 
boolean equations, it will immediately be implemented in tools used for 
designing optimal digital circuits, which today is a very real world problem. Numerous 
scientific and technical problems, in informatics as in many other domains, are 
embedded in and defined by real world environments. Real world problems should 
not be contrasted with theoretical problems but rather with gratuitous problems. An 
application should be useful. 



3.2 An Application Is Useful for People Other than Designers of This Application 

Generally, an application is of value to socio-economic or cultural domains. When 
there is a large gap between the application domain (for instance, medicine) and 
informatics, it is quite easy to say what an application is. But computer science 
is itself a source of real problems, and computer technology and engineering use 
numerous applications. Often there is a large hierarchy of programs upon 
which an application is built, but, and this is characteristic of an application, 
it must have a use value for the so-called end users (although it does not 
necessarily have an exchange value: see free software). 



3.3 Tool and Application Layers 

An operating system is a useful piece of software, and it is usually called a tool and 
not an application. Above an OS there are assemblers, interpreters, compilers, 
programming languages, ...: a hierarchy of layers. Some tools at level i can be 
considered as applications for level i - 1. As in the OSI model, the term 
application could be reserved for the last layer, the one of end users. For any 
layer there is also a historical aspect (its life-cycle): experiment, toy application, 
mock-up, prototype, alpha version, beta version, release candidate, and release; 
even a beta release candidate exists! 

Possible aims of an experiment are to show that an application project is 
worthwhile, to prove that some tools are operational, to confirm a scientific 
hypothesis, etc. Sometimes a mock-up does not implement any computing 
mechanisms but only the external behaviour of a program. Furthermore, it is 
generally not at the final scale, and the transition from a toy version to a 
version at the pertinent scale often holds unpleasant surprises. Normally, a prototype 
is at the correct scale, but some functionalities are lacking. Building them can lead to 
serious difficulties. Experiments, mock-ups, and prototypes should not be given to 
end users. 

Thus, “who are the end users?” and “which end user tasks 
could be aided by the program?” are the very first questions to ask of a program 
which aims to be an application. 

3.4 Different Sorts of Usefulness 

Not all programs dealing with a real world problem are applications, even if they 
satisfy “buyers”. Let us consider for instance a therapy adviser in medicine. The 
program can be of value to disease specialists, not because they follow the 
therapeutic advice given by the program, but because during the program design 
process they have acquired a better understanding of how they make therapeutic 
decisions. 

Possible objectives of a toy application are to confirm or reinforce a scientific 
hypothesis, or to show that a tool is operational; such a program would be 
better called an experiment than an application. The usefulness of a mock-up may 
be to convince decision-makers to allow the development of a “true” application. 
In the same way, a prototype can be useful for different persons (designers, 
decision-makers, developers, ...). Thus, even if toy applications, mock-ups, and 
prototypes are important steps in an application construction process, they are 
not applications. Most of them do not lead to applications. 

3.5 An Application Should Be Used by Several Users 

The number of users of an application can vary from a handful (e.g. for a highly 
specialized design aid system in molecular biology) to millions of users (e.g. for a 
web browser). A great number of users can considerably increase the difficulty 
of the final step of an application construction process. Indeed, even if the design 
has been very carefully made, it has been realized by a small team which can 
be unaware of some needs or behaviors of millions of people. The alternating cycle 
of tests and modifications or new functionalities may take a long time 
to converge, and this time is needed not only for fixing bugs but also for 
obtaining a satisfactory version for a well chosen end user sample. Furthermore, 
an unmaintained program usually does not live very long. All this implies 
that internal software engineering characteristics are fundamental if a program 
is to have a chance of becoming an application. 

3.6 Assessment of an Application 

An assessment protocol should be designed at the very beginning of the design 
process of an application. Choices for assessing an application are not always easy 
to make: how should user samples be chosen? For what tasks? What criteria must be 
satisfied? What parameters can be measured? ... 

Many parts of an assessment process are specific to the application under 
consideration. In any case, assessment does not mainly fall within the competence 
of informaticians, but within the competence of ergonomists and application 
domain specialists. 

3.7 Application Cost 

It can easily be deduced from the preceding lines that the initial development 
costs of obtaining the first working program are just a fraction of the investment 
necessary for the final application release. “Tell me what your project costs and I 
will tell you whether there is any chance you will construct an application!” 

4 Artificial Intelligence Applications 

In order to specify the internal and external features specific to an AI 
application, we must first present, very succinctly, our viewpoint concerning AI. 

4.1 Some Words about AI 

AI is the part of informatics dealing with the specific domain of human 
intelligence considered in a restrictive meaning: the domain of reasoning intelligence, 
that is to say the capability needed for grasping and processing concepts and 
ideas, for acquiring and using knowledge, for solving problems, for learning, ... 
Significantly, the two problems proposed by Turing in his foundational paper 
are playing chess and learning a natural language. There are other kinds of 
intelligence, for instance the practical intelligence of homo faber, which are not 
directly relevant to AI, which is mainly concerned with the essential homo sapiens 
sapiens faculty. One can consider three facets in informatics: computing science 
(the study of algorithms), computer science (the study of computing hardware 
and software systems and devices), and information processing science (the study 
of operational informatic models of different real world entities). For AI, the 
computing science facet is mainly specialized in computational logics, and the 
computer science facet is specialized in LISP, Prolog, massively parallel systems, 
etc. From the viewpoint of information processing, the main goal of AI is to 
construct operational informatic models of intelligent human activities, or, more 
modestly, to build programs which can aid people in intellectual tasks. Reasoning 
capabilities (for instance, inference) on large knowledge bases 
are needed for such systems. Knowledge and reasoning aspects exist in any 
information processing problem, but there are different sorts of knowledge. For 
instance, physical-mathematical knowledge is different from phenomenological 
knowledge. The latter is relevant to AI. It cannot be represented in the same 
way as, for example, a physical law. Generally, phenomenological knowledge is 
of a discursive nature, so it is vague, ambiguous, contextual, and incomplete, and 
it must be elicited, elaborated, limited, and specified before any automatic 
processing. These characteristics of AI relevant knowledge entail that the assessment 
of AI applications is a particularly difficult problem. 

4.2 Components of an AI Application 

One can roughly distinguish four basic internal components in an AI application. 

— The deepest component is a knowledge base management system (KBMS), 
having itself three main parts: a knowledge representation language, an 
inference engine, and a control mechanism. This fundamental part, which 
constitutes the kernel of an AI application, may be more or less generic; it can 
be a tool (but there are not many generic KBMS comparable to Data 
Base Management Systems). Specific AI difficulties are due to the fact that 
interesting AI problems are usually undecidable (or semi-decidable), and 
NP-complete or NP-hard problems are simple AI problems! 

— Interfaces are the outermost component. Developers, system managers, 
and end users need specific interfaces. Due to the peculiar nature of the 
knowledge involved, knowledge elicitation and acquisition are made easier if there is 
a strong correspondence between interface objects and their internal 
representation. In the same way, if the application is intended to be an aid to 
intellectual work, it is important that the system is able to present its 
inferences in a manner understandable to the end user. End users have to 
be able to compare their reasoning with the system's reasoning. 

— The programming environment is very important for the engineering 
qualities of the software built. All usual software qualities (portability, robustness, 
reusability, efficiency, ...) are required for an AI application but, again 
because of the peculiarities of the knowledge considered in AI, 
the ability to evolve is a fundamental feature, and every programmer knows that it is 
more difficult to construct a program with loose specifications than a very 
constrained one. During the early years of AI the situation was quite 
simple: LISP was dominant. Then came logic programming and Prolog, the object 
paradigm and object LISPs, C++, and, more recently, Java. The programming 
paradigms embedded in these languages have important consequences for the 
resulting application. Even if AI developers wish for declarative languages, 
today they widely use object programming languages. This point will 
not be discussed in this paper; it is an important topic that deserves a whole 
study. Let us only mention two general reasons explaining the success of 
object oriented languages: the existence of class libraries such as STL^ or LEDA 
[18] for C++, or the Java API, and the fact that object oriented languages 
are effective for the production of applications that can be (relatively) easily 
maintained and extended. It is also necessary to choose tools (compilers, 
libraries, etc.) that are easily available on many platforms and maintained. 

— The networking environment is also crucial, because many applications need 
distributed data in various forms. Client-server applications are becoming more 
and more popular, specifically because operating systems integrate facilities 
for accessing networks. Even if client stations become more powerful, 
applications need more processing power, which has to be provided by suitably 
dimensioned servers. It is worth taking distributed information 
and/or distributed processing into account at the beginning of the development of tools 
or applications. The choice of the development environment is important 
because some environments are not very well adapted to communications between 
processes or computers. Communication facilities explain the success of C 
and C++ (closely tied to Unix, one of the first systems integrating 
mechanisms of communication) and of Java^, because the Java API contains 
classes facilitating access to shared resources. 



^ The “Standard Template Library” is a library of data structures and algorithms 
integrated in the ANSI standard for C++. 



4.3 Assessment of Applications 

One of the distinctive features of an AI application is that it must have 
reasoning capabilities. Evaluation of such an application is therefore more difficult than 
evaluation of an application that only manages data or performs numerical computation. 
For instance, the evaluation of a database management system (DBMS) focuses 
mainly on the run time of queries and the behavior of the application on large 
bases. Evaluation of DB applications does not take into account the relevance of 
the given answers, because such an application must provide complete answers; 
otherwise the application is buggy. 

In the case of an information retrieval application, a difficult problem is to 
know what information is relevant to a given question. The time of physical access 
to the data is less significant than for a DBMS. Indeed, an information retrieval 
system (IRS) provides an “intellectual access” to the base, as opposed to a 
DBMS, which provides a physical access, and evaluating the interest of the data 
accessed (and how it satisfies a user) requires measurements more difficult than 
the measurement of physical access time. When we talk about the speed of an IRS, it 
is necessary to take into account the number of decisions that have to be made 
by a user to obtain an appropriate answer rather than the number 
of milliseconds necessary for processing a query [3]. The quality of the results 
returned for a query is thus essential, because a system that is slower than another 
(because it performs more reasoning) but provides more satisfactory 
results may allow a user to obtain the information he (she) is looking for more 
quickly (with fewer operations). 

Precision and recall are the two parameters usually used to assess 
the quality of results provided by an IRS. Precision is the percentage of 
retrieved documents that are relevant, whereas recall is the percentage of all 
potentially relevant documents that are actually retrieved. To estimate these rates, 
it is thus necessary to ask a user whether he (she) considers given documents relevant or 
not. For precision, one “just” has to ask him (her), for each retrieved document, 
whether that document is relevant or not. Recall, on the other hand, depends on the 
number of relevant documents in the base, so it is necessary to ask the user 
about the relevance of all documents, and this is unrealistic because a serious 
evaluation can be done only on large bases. Thus, the recall parameter can only 
be estimated. 
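
The two measures can be computed as in the following sketch. It assumes, for illustration only, that the full set of relevant documents is known, which the text has just argued is unrealistic for large bases; the function name precision_recall and the document identifiers are hypothetical.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) as fractions between 0 and 1.

    retrieved: documents returned by the system for one query.
    relevant:  documents the user judges relevant for that query
               (assumed fully known here, for illustration only).
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = retrieved & relevant
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


p, r = precision_recall(
    retrieved=["d1", "d2", "d3", "d4"],
    relevant=["d1", "d3", "d5", "d6", "d7"],
)
print(p, r)  # 0.5 0.4
```

Here two of the four retrieved documents are relevant (precision 0.5), and two of the five relevant documents were retrieved (recall 0.4).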

Moreover, this is only an evaluation of one given query for one given user, 
which can vary over time, and generalizing such a result to a system evaluation 
is hazardous. A complete evaluation of an IRS requires collecting the satisfaction 
of many users on many queries on a large base. Under these conditions, it 
is not surprising that reliable evaluations of IRS are so rare [4,20]. The difficulty 
of evaluating such systems, and thus of comparing various 
approaches, explains the proliferation of approaches and systems. 

However, IR seems to be one of the simpler problems of AI because an IRS 
“just” has to return a list of information relevant to a given query. For more 
complex applications, such as industrial design or natural language processing, 
evaluation is undoubtedly more complicated. 



5 CGs Applications 

Many different sorts of programs are described in ICCS papers. One can mention: 

— implementations of algorithms or methods (constrained maximal join, graph 
or subgraph isomorphism, CGs matching, classification and clustering, 
generalization, inference or learning algorithms, ...); 

— programs implementing new capabilities added to already existing tools; 

— numerous CG-editors, more or less integrated in existing tools; 

— general tools; 

— and, finally, the programs that interest us here, those close to applications (in 
the specific meaning used in this paper). 

The classifications adopted, for program sorts as well as for domains, are 
certainly questionable. Only titles and ICCS references are given. Readers in a hurry 
can jump directly to the end of this section (Discussion). 



Natural Language Processing 

This domain has been very active since the first ICCS. Papers in NLP sessions are often 
purely descriptive; sometimes programs are mentioned, but they can hardly be 
called prototypes (of applications). Generally, they only illustrate how CGs are 
used for modeling natural language notions. Among the implemented approaches 
one can find: 

— studies of specific linguistic notions, e.g. Computational Processing of 
Verbal Polysemy with Conceptual Structures [16], Analysis of Task Oriented 
conversation into CG structures [17]; 

— modules to be added to existing systems, e.g. Representing natural language 
causality in CGs [11], to be included in the general system Kalipsos (now 
abandoned), Italian linguistic analysis module to be included in NOMOS [12], 
Natural Language Text processing and the Maximal Join Operator, to be 
included in CoKEMan [14], Using the CG Operations for Natural Language 
Generation in Medicine [13], to be included in the GALEN project; 

— independent modules, e.g. Extraction of Implicit and Explicit Knowledge 
from NL Texts [12], Implementing a Semantic Lexicon [17]; 

— complete systems, e.g. RECIT a Multilingual Analyser of Medical Texts [12], 
DB-MAT: Knowledge Acquisition, Processing and NL Generation Using CGs 
[14]. 



Information (or Document or Knowledge) Retrieval Systems 

Many experiments have been carried out in this domain. Most of the following works 
are, or could lead to, applications: Linguistic Processing of Text for a Large 
Scale Conceptual IRS [12], Conceptual structures and Structured Documents 
[14], An experiment in document retrieval using CGs [15] (a prototype based on 
CoGITo has been developed), CGKAT for organizing and retrieving knowledge 
(this system uses tools such as Wordnet, Thot, and CoGITo) [14,15], CGs 
for Corporate Knowledge Repositories [15], The CG Mars Lander [15], Using 
Viewpoints and CG for the Representation and Management of a Corporate 
Memory in Concurrent Engineering [16], CG for Representing Business Processes 
in Corporate Memories [16], Embedding Knowledge in Web Documents: CGs 
versus XML-based Metadata Languages [17]. 



Knowledge Acquisition and Representation 

Ten or so papers, more or less far from an application, can be gathered in this 
class: Acquiring temporal knowledge from schedules [11], Modelling and Sim- 
ulating Human Behaviours with CGs [15], Menu based interfaces to CG: the 
CGLex approach [15], Knowledge extractor [15], Complex Modelling Constructs 
in MODEL-ECS [15], A semantic validation of CGs [16], Knowledge Querying in 
the CG Model: the RAP Module [16], Multikat a Tool for Comparing Knowledge 
of Multiple Experts [16], A CG based behavior system extraction [17]. 

Only five papers answered the interesting SCG-1 initiative [17]. Among 
them, only two built a program for solving the problem. Carrying on the 
SCG initiative was not possible this year, though we could have expected more 
answers than last year. 



Requirements 

This seems to be an important potential application domain for CGs. Let us 
mention some papers: Automatic Integration of Digital System Requirements 
Using Schemata [13], Service trading using conceptual structures [13], Generic 
Trading Service in Telecommunication Platforms [15], Applying CG Theory to 
the User-Driven Specification of Network Information Systems [15], A CG-Based 
Behavior Extraction System [17]. 



Miscellaneous 

For the sake of completeness, let us mention two dedicated systems: Using CGs 
as a common Representation for Data and Configuration in an Active Image 
Processing System [17], On Developing Case-Based Tutorial Systems with CGs 
[17]. 



Discussion 

Most of the papers presented at ICCS (even during the application sessions 
organised almost every year) aim at testing ideas, approaches, algorithms, or 
techniques. Generally, they are experiments and there are only a few prototypes; 
after all, this is quite normal for scientific conferences. Nevertheless, system 
evaluations are often too superficial, even if assessing an AI system is rather difficult, 
and, what is more annoying, many systems are not sufficiently described, 
are not easily available, and appear and disappear very quickly. 

The distinction between applications and tools is not always easy to draw, but
generic tools are needed in order to program experiments rapidly. Besides the
Peirce workbench, the following systems aim to be general CG tools: CGP [11], UDS
[12], CoGITo and CoGITaNT [14,16], Prolog++ and CG-Prolog [12], CGKAT
[14,15], WebKB [15], WebKB-GE [16], Deakin Toolset and EGP and CGKEE
[15], Notio [17], Synergy [17]. These tools are more or less finished, more or less
re-usable, and rarely maintained. How many of these tools are still maintained? The
web page on conceptual graphs^ presents only five of them. The “Conceptual
Graphs Homepage”^ only adds one to the list, whereas the old page “CG Tools”^
presented twenty-six tools! What became of these tools?

If there are no more tool sessions, we should worry about forthcoming applications.
Without good tools (generic, reliable, available on various systems, documented,
maintained, efficient, etc.) it is difficult to develop applications. Let us insist on
the importance of the SCG-1 initiative [17]. One needs benchmarks. And it is not
surprising that the two groups which presented an automatic solution to
the problem have been developing general CG tools for a long time.

6 And Us, Wherein Are We ? 

In this section we describe the situation in our CG research group with respect to
the “application” facet.

We are working on a document retrieval project, called Mogador, with ABES
(the agency of French university libraries). The goal of this work is to produce
experimental tools, based on CGs and Rameau^, for indexing and retrieving documents
by their subject, and to evaluate, in collaboration with librarians, the results
provided by this approach. If the evaluation is positive, i.e. if the experiment is
sufficiently convincing for decision-makers, we hope that our ideas will be taken
into account by the team that will design the next generation of the French
document retrieval system. Mogador is an illustration of the distinction made in the
introduction: it is clearly not an application (in the software engineering
meaning we have adopted in this paper), but it is an experiment applying CGs to
document retrieval.

^ http://www.bestweb.net/~sowa/cg

^ http://www.cs.uah.edu/~delugach/CG

^ http://www.cs.uah.edu/~delugach/CG/CGTools.html

^ A thesaurus used for the indexing of documents in almost all French public libraries.
We are just beginning to work, with other academic and industrial teams,
on a project called OPALES. This project aims to build a website prototype for
cooperative work on audiovisual documents. INA (the French national audiovisual
institute) is the project manager. If the prototype is successful, different
applications are planned, for instance a site dedicated to teachers and another one to
researchers. Industrial teams of the OPALES project will be in charge of the
development of these applications. The CG components of OPALES and Mogador are
much the same, and the four basic internal components presented in paragraph
4.2 are the following.

— Knowledge base management system. The CGs model used is based on nested 
positive graphs with coreferent links [7,8] and CG rules [19]. This language 
is generic and it is used in every project of our research group. The inference 
engine is based on projection. The control mechanism is not generic and we 
are defining an ad hoc mechanism for Mogador. 

— Interfaces. We have decided to use the CG model at all levels: theoretical 
model provides the internal representation, reasoning is carried out by graph
operations, and end users also handle CGs through a GUI. We believe that
it is an essential feature for building intelligent cooperative systems, i.e. for 
knowledge programming. Indeed there must be an “isomorphism” between 
what is seen by the user and the formal model, to enable faithful modeling of 
the actual data and problems, and to correctly interpret the results and the 
computation. There must also be an “isomorphism” between what is seen 
by the user and how objects and operations are implemented, in order to 
understand why and how results have been obtained. The most secure way 
to do this is to use a homogeneous model: the same kind of objects occur at 
each level [6].

— Programming environment. The Java language has been chosen for portability
reasons, and it is used to develop the user interface. On the other hand,
to carry out the processing and to handle large quantities of data, we use
C++ for performance reasons, as well as CoGITaNT [9], the library of C++
classes developed within our team since 1994 (CoGITaNT is an evolution of
CoGITo [10]). This library is a general platform for handling CGs.

— Networking environment. We have chosen to develop various modules ac- 
cording to a client-server architecture. For instance, in Mogador, the server, 
based on the CoGITaNT platform, manages ontological knowledge (more 
than 400000 types of concepts) and graphs, and performs graph operations. 
Clients are able to edit the graphs in graphic form, to execute operations 
and consult results of operations. Extensions we have developed have been 
designed in a generic way in order to be re-usable in other works. Thus, for 
responding to the SCG-1 initiative and for representing the Sisyphus-I prob- 
lem [2,1], we used our graph editor developed originally for Mogador, and we 
have benefitted from the extensions recently applied to CoGITaNT for the 
inference leading to the resolution itself. 
Our research group is clearly in the same position as other CG groups: we
have carried out many experiments, and we hope to be in a position, right now, to
construct a prototype.

7 Conclusion 

To endure, any AI domain needs successful applications. There are at
least two different and important reasons for that. Firstly, from a scientific point
of view, AI is an empirical science (two main reasons have been sketched in
this paper: the nature of AI problems, which are often undecidable, so that only
incomplete solutions can be sought, and the phenomenological nature of the
knowledge with which AI is concerned). AI needs experiments in order to confirm
scientific hypotheses. Secondly, the AI community has made many promises since its
beginning, but only some of them have been honored. “Buyers” of AI want to
see effective realizations and not only great ideas or sophisticated theories.

As far as we know, if operational CG applications exist at all, there is probably
only a handful of them. Nevertheless, we believe that there are medium-term
potential application domains, especially corporate memory, information retrieval, and
specification of complex systems. In order to build applications, efficient tools
are needed, and, what is rather distressing, we could deliver exactly the same
discourse that Bob Levinson and Gerard Ellis delivered when they launched
the Peirce project in 1992! (Furthermore, efficient tools are based on efficient
algorithms, but this domain is not currently active in the CG community.)

We, the CG community, have difficulty analyzing, synthesizing, and accumulating
our knowledge (maybe we should come back to dedicated workshops?).
An honest answer to the title question of this paper could be: we have made numerous
experiments, most of the time without drawing serious conclusions; we have
put our feet on a large area, but the distance from our starting point is rather
short. Optimists could say that it is only because our domain is still in its
infancy.
Abstract. This paper presents some important issues that a knowledge
engineer must consider when developing a conceptual graph (CG) based 
system, particularly in the context of a deductive data/knowledge base. These 
issues entail fundamental representation choices that must be made prior to the 
development of the system. Of course, inevitable consequences follow and 
delimit the scope of the system in terms of its representational and inferential 
capabilities. However, for industrial development, a less ambitious but realistic 
approach is often more suited to the kind of constraints usually imposed by the
market place: feasibility, scalability, simplicity, interpretability, portability, 
partial knowledge, time to market, etc. This paper presents and discusses some 
of these issues and sets the stage for an in-depth discussion pertaining to the 
development of CG-based systems for industrial applications, particularly for 
applications where a CG system provides the conceptual level of knowledge 
organization functionalities required by an information system. 



1 Looking at the Heart of CG Systems 

Any system is developed in three layers: external, conceptual and implementation. 
The external layer meets the needs of the end-user by allowing him to see only what 
is appropriate and relevant to his needs, i.e., usually a subset of the functionalities for 
which the system was built. It provides him with an adapted interface to the system. 
To do so, it uses a carefully defined mapping onto the conceptual layer. The 
conceptual layer is the conceptual space in which the knowledge engineer designs a 
solution space within a generic problem-solving framework, for a particular problem 
space. Issues like what information is relevant to reach a solution state, how should 
that information be represented, what operators are required to process it and in the 
end provide the necessary support to implement the sought-for functionalities, must 
be dealt with at this level. It describes how a domain is modeled so that related 
problems can be solved using some problem-solving framework. Finally, the 
implementation layer maps the conceptual layer onto operational technology. This 
three-layer architecture allows the mappings between the layers to provide 
independence of each layer, allowing their updating without causing a global revision
of the whole system. Each layer can thus evolve at its own pace. 

Any conceptual graphs (CG) system may also be seen as having three layers, 
although not as obvious as with database systems. The knowledge engineer is the link 
between all three layers. That sometimes causes confusion in CG related research. At 
times, the knowledge engineer is seen as a "super" user for which a particular 
interface to the system is required, one that allows the editing of the conceptual layer. 
In this case, that means CG editing facilities. 

This research community at times proposed ways to express CGs at this level. For 
instance, let us remind the community of the notation introduced by [1] to represent 
functions in the CG formalism (see Figure 1) using a double-lined arrow to express 
the functional dependency of the last argument of a conceptual relation on its 
preceding arguments. The graph of Figure 1 states that a person has only one SIN 
number (Social Insurance Number). 




Fig. 1. A functional dependency expressed using a function symbol. 



However, to be thorough, the introduction of the function symbol entailed the 
update of the conceptual layer so that the integration of this new symbol to the 
previously defined representation and inferential mechanisms of the CG notation 
would be complete. So [1] also proposed to modify the join operation to take into 
account the functional dependency that this new "function" symbol introduces. In 
brief, when all arguments of a function symbol f_1 are joined to some other set of
concepts that are also arguments of a function symbol f_2 of the same type as f_1, then
the output concepts of these two functions must also be joined. With our example, if we are to
join the PERSON concept of Figure 1 to some PERSON concept in some other graph
to which an ID function is also connected, then the two SIN concepts (that of the
graph of Figure 1 and that of the other graph) must be joined as well (see Figure 2).
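
The propagation just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the algorithm of [1]: the names ConceptMerger and join_with_functions, the tuple encoding of function symbols, and the concept identifiers are all hypothetical, and type conformity of the joined concepts is not checked.

```python
class ConceptMerger:
    """Union-find over concept identifiers; joined concepts share a root."""

    def __init__(self):
        self.parent = {}

    def find(self, c):
        self.parent.setdefault(c, c)
        while self.parent[c] != c:
            self.parent[c] = self.parent[self.parent[c]]  # path halving
            c = self.parent[c]
        return c

    def union(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra != rb:
            self.parent[rb] = ra
            return True
        return False


def join_with_functions(functions, a, b):
    """Join concepts a and b, then propagate joins through function symbols.

    functions: list of (function_type, input_concepts, output_concept).
    Returns the ConceptMerger recording which concepts ended up joined.
    """
    merger = ConceptMerger()
    merger.union(a, b)
    changed = True
    while changed:  # repeat until no further output concepts must be joined
        changed = False
        seen = {}
        for ftype, inputs, output in functions:
            key = (ftype, tuple(merger.find(c) for c in inputs))
            if key in seen:
                # Same function type, same (joined) inputs: join the outputs.
                if merger.union(seen[key], output):
                    changed = True
            else:
                seen[key] = output
    return merger


# Two graphs, each stating that a PERSON has a SIN through an ID function.
functions = [("ID", ("person_a",), "sin_a"), ("ID", ("person_b",), "sin_b")]
merged = join_with_functions(functions, "person_a", "person_b")
print(merged.find("sin_a") == merged.find("sin_b"))  # True
```

Joining the two PERSON concepts forces the two SIN concepts into the same class, which is the behaviour that Figure 2 illustrates.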

Without that modification of the conceptual layer, the introduction of the function 
symbol would only have been an editing feature that would have had no impact on 
the enforcement of the constraint that it represents. Like many features in other 
conceptual modeling languages (like in UML for instance), it may have been useful to 
the knowledge engineer as a reminder of the functional dependency between some 
concepts, but it could have been written down on a piece of paper and that would 
have had the same usefulness to the knowledge engineer. In my opinion, the CG
formalism offers much more than modeling constructs to build conceptual models: it
provides a system of logic that is easy to use to describe conceptual models, and it offers
executable specification capabilities during the development cycle of a system. It
is vital that any researcher in this community who introduces some editing
facility to the CG theory also provide the updated conceptual layer so that its
semantics is fully integrated into the operations of the system. Therefore, the angle that
I chose to look at a CG system in this paper is through its conceptual layer, the layer 
in which its processing capabilities are defined. Unfortunately, the conceptual layer is 
often the hardest to grasp. The external layer offers an interface to the system and the 
implementation layer is based on technology. Thus they are both concrete, while the 
conceptual layer is used for conceptualization and can remain very abstract. Also, the 
conceptual layer is mostly used in modeling applications for information and database 
systems. Therefore, this paper views a CG system as providing the knowledge 
organization and management core for large industrial data intensive applications, 
pretty much like the core representation engine of a deductive database. Therefore, 
the recommendations made in this paper may not apply to other types of CG 
applications. Nevertheless, I hope that this paper will help trigger transfers to and 
from other uses of CG technology as encountered in the different research projects 
reported in the literature. 








Fig. 2. Joining graphs containing function symbols. 



From the point of view depicted in this paper, the conceptual layer of any CG 
based system is its main core. It forms a system of logic with its own representation 
constructs and operators. It provides the knowledge engineer with a unique way of 
designing a solution space for a particular problem space. In that regard, though a
subset of the CG theory may be mapped onto first-order predicate logic (FOL), its
approach to problem solving, mainly based on existentially quantified concepts
interconnected in a semantic net and on a projection operation, is unique. In
comparison, with FOL clausal formulae, the conceptual space that a knowledge
engineer considers in order to provide a solution space to some problem deals with
quantification, unification, and rule chaining and ordering. This conceptual space (for
providing a problem solving system based on production rules) is totally different
from that of CG systems. The knowledge manipulation operators being different, the
approach used to look for a solution space is therefore also different from that of other
representation schemes. That surely has an impact on the cognitive process that the
knowledge engineer undergoes.

For instance, [2] later introduced a representation framework for constraints based 
on the projection operation. Because the projection operation is at the heart of any CG 
system, this approach provided a general and unified representation framework for all
semantic constraints found in database literature today. In brief, a constraint graph is 
stated; it is known to be false. In order for all other graphs to be valid, there must be 
no projection from the constraint graph onto any graph in the CG system. Figure 3 
shows a constraint graph stating that it is false that a person may have two distinct 
SIN numbers, i.e., there must not be any graph in the system on which the graph of 
Figure 3 could project itself. Here the reader should notice that we use the normal 
form assumption [3]: concepts known to represent the same individual are joined. 
The framework proposed in [3] also allows exceptions to constraint graphs to be 
represented. By providing a general mechanism to represent, among other things, 
functional dependencies, the modification done to the join operation in [1] now 
becomes obsolete because any operation that would result in a graph that would 
violate a constraint would be stopped. However, it is probably more convenient to the 
knowledge engineer to edit functional dependency constraints using the function 
symbol of [1] rather than producing graphs like the one in Figure 3. Again, a mapping 
between the external layer (which could use the function symbol) onto the conceptual 
layer (which would use constraint graphs) could be defined. 

In the example above, it is obvious that the projection operation guided the
engineering of a constraint representation mechanism adapted to the CG theory. This
approach is totally different from the one adopted with clausal systems, for instance.
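
As a rough illustration of constraint checking by projection, the following Python sketch tests whether a constraint graph projects onto a graph of the system. It is a brute-force sketch under stated assumptions and not the framework of [2]: the graph encoding, the function names and the sample data are hypothetical, and the projection is required to be injective, which is one way to read the "two distinct SIN numbers" of the constraint.

```python
from itertools import permutations


def is_subtype(t, u, parents):
    """True if type t equals u or is a (transitive) subtype of u."""
    stack, seen = [t], set()
    while stack:
        cur = stack.pop()
        if cur == u:
            return True
        if cur not in seen:
            seen.add(cur)
            stack.extend(parents.get(cur, ()))
    return False


def projects(source, target, parents):
    """Brute-force test for an injective projection of source onto target.

    A graph is a pair (concepts, relations): concepts maps an identifier to
    (type, referent), with referent None for a generic concept; relations is
    a set of (relation_type, tuple_of_concept_identifiers).
    """
    s_concepts, s_rels = source
    t_concepts, t_rels = target
    s_ids = list(s_concepts)
    for image in permutations(t_concepts, len(s_ids)):
        m = dict(zip(s_ids, image))
        concepts_ok = all(
            is_subtype(t_concepts[m[c]][0], s_concepts[c][0], parents)
            and s_concepts[c][1] in (None, t_concepts[m[c]][1])
            for c in s_ids
        )
        if concepts_ok and all(
            (r, tuple(m[a] for a in args)) in t_rels for r, args in s_rels
        ):
            return True
    return False


def violates(graph, constraints, parents):
    """A graph is invalid as soon as one constraint graph projects onto it."""
    return any(projects(c, graph, parents) for c in constraints)


parents = {"STUDENT": ["PERSON"]}  # hypothetical type hierarchy
two_sins = (
    {"p": ("PERSON", None), "s1": ("SIN", None), "s2": ("SIN", None)},
    {("ID", ("p", "s1")), ("ID", ("p", "s2"))},
)
valid = (
    {"a": ("STUDENT", "Ann"), "n": ("SIN", "123")},
    {("ID", ("a", "n"))},
)
invalid = (
    {"a": ("STUDENT", "Ann"), "n": ("SIN", "123"), "n2": ("SIN", "456")},
    {("ID", ("a", "n")), ("ID", ("a", "n2"))},
)
print(violates(valid, [two_sins], parents))    # False
print(violates(invalid, [two_sins], parents))  # True
```

The exhaustive search is exponential in the size of the constraint graph, so it only serves to make the definition concrete; an implementation layer would need an optimized projection.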



false: 




Fig. 3. A constraint graph enforcing a functional dependency. 



2 Unicity of Representation: Simple Issues 

At the conceptual layer, a CG system is a system of logic. At this point we would like 
to remind the reader that different logical representations of the same system are 
possible under the CG operators. As examples, let us cite the expansion and 
contraction operators that allow the granularity of representation to vary, adding more
or less detail to a graph. Let us mention also that through the type lattice, multiple
representations of an individual concept are possible through the use of different 
concept types. All these different representations are logically equivalent. 
Unfortunately, logical equivalence may be a barrier to the engineering of a CG 
system instead of being its facilitator. The operators of the conceptual layer will have 
to be mapped onto some implementation layer, i.e., on operational technology. That
constraint requires that the conceptual layer be chosen with care: the mapping
between the conceptual and implementation layers may not be as easy to achieve
under one representation as it would be under another one. Though there is not much 
work today in the CG literature on the implementation layer of CG systems, it seems 
obvious that the representation of the conceptual layer must facilitate its mapping 
onto the implementation layer (for obvious feasibility and efficiency reasons). For 
instance, since the projection operation is at the heart of any CG system, any 
implementation layer must optimize its use. 



2.1 The Normal Form Assumption 

One way to facilitate the implementation of the projection operation is to have the 
graphs of the system represented under the normal form assumption (NFA) 
mentioned above. So we now introduce a first recommendation. 

R1. The conceptual layer of a CG based system should be represented
under the Normal Form Assumption. 

That will prevent the projection operation from being preconditioned on previous join
operations that may or may not have been carried out. That recommendation entails
the most specific representation assumption (MSRA) below. 

R2. Any concept should be represented under its most specific form. 

That is, if two concepts [t_1: i_1] and [t_2: i_1] exist with t_1 < t_2, then only the first concept
should be used at the conceptual layer so that all identical concepts can automatically
be joined under the NFA. Since we now deal with the representation of the
conceptual layer, nothing is said about how a graph should be presented to a user
through some interface. It may be the case that the user prefers to see [t_2: i_1] instead of
[t_1: i_1]. The independence of layers allows different representations to coexist, one for
each layer if that is necessary, though it may introduce redundancy.



2.2 Standardization and Interpretability 

In the same line of thought, the referent of a concept should be substituted by the
object that it represents. That is, there are different ways to represent concepts. For 
instance, we could use [INTEGER: 8] or [INTEGER: #14] to represent the 
mathematical entity 8 (an integer number), if #14 is the referent to the object 8 stored 
somewhere in memory. Even though both concepts represent the integer 8, one 
should try to minimize the use of referents whenever possible, as they introduce a 
level of indirection and impact on interpretability. The reader is referred to [4] for a 
complete coverage of this topic. 
R3. Whenever possible, the representation of an individual concept
should use the object directly instead of using its referent. 

Finally, in terms of syntax, any system should worry about its interconnection to the 
outside world and to other systems. There was a strong effort these past years toward 
the standardization of the CG notation that should be taken into account. Even though 
it may be more useful to the users or the knowledge engineer to be provided with 
some other syntactical format (which can be done at the external layer), the 
conceptual layer should stay closely tied to the CG standard. 

R4. The conceptual layer should be represented in standardized CG 
notation as much as possible. 



2.3 The Type Hierarchy 

As for the description of the type hierarchy, though it is tempting to hide the 
complexity of representation of some concept in very specialized types, the modeling 
of the domain should be as close as possible to the natural categories (types) of the 
objects found in the domain. This is to facilitate the conceptualization of the solution 
space pertaining to that domain, and to facilitate the interpretability of the conceptual 
layer. 



R5. The types used to describe the domain should be as close as
possible to the natural categories found in the domain.

Furthermore, it has been noted in the past that the engineering process of an industrial 
system is often based on partial knowledge. It has also been noted that a lattice may 
be hard to conceptualize as the support structure for the type hierarchy. For these 
reasons, we may want to have the knowledge engineer describe the type hierarchy in 
any shape that seems natural to the task (at the external layer), and have a lattice 
completion algorithm to produce the type lattice required at the conceptual layer. In 
effect, every CGer knows that a lattice structure is compulsory to a CG system since 
the join operation must verify the most general common subtype of two types before 
considering joining concepts that conform to these two types. 

R6. The type hierarchy of a CG system must be a lattice. 
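
To see what the lattice requirement asks of the join operation, here is a small Python sketch that looks for the greatest common subtype of two types in a hierarchy given as a mapping from each type to its direct supertypes. The function names and the sample hierarchies are hypothetical. When no single greatest common subtype exists, the function returns None, which is the situation a lattice completion algorithm would have to repair.

```python
def ancestors(t, parents):
    """All supertypes of t, including t itself."""
    result, stack = set(), [t]
    while stack:
        cur = stack.pop()
        if cur not in result:
            result.add(cur)
            stack.extend(parents.get(cur, ()))
    return result


def greatest_common_subtype(t1, t2, parents):
    """Return the greatest common subtype of t1 and t2, or None if the
    hierarchy does not define a unique one."""
    types = set(parents) | {p for ps in parents.values() for p in ps}
    up = {t: ancestors(t, parents) for t in types}
    common = [t for t in types if t1 in up[t] and t2 in up[t]]
    # The greatest common subtype is a supertype of every common subtype.
    tops = [t for t in common if all(t in up[c] for c in common)]
    return tops[0] if len(tops) == 1 else None


lattice = {
    "STUDENT": ["PERSON"],
    "EMPLOYEE": ["PERSON"],
    "ABSURD": ["STUDENT", "EMPLOYEE"],
}
not_a_lattice = {
    "STUDENT": ["PERSON"],
    "EMPLOYEE": ["PERSON"],
    "ASSISTANT": ["STUDENT", "EMPLOYEE"],
    "INTERN": ["STUDENT", "EMPLOYEE"],
    "ABSURD": ["ASSISTANT", "INTERN"],
}
print(greatest_common_subtype("STUDENT", "EMPLOYEE", lattice))        # ABSURD
print(greatest_common_subtype("STUDENT", "EMPLOYEE", not_a_lattice))  # None
```

In the second hierarchy, ASSISTANT and INTERN are two incomparable maximal common subtypes of STUDENT and EMPLOYEE, so the join operation would have no single type to check against.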

Also, all types, either for concepts or relations, should be seen as primitive. That is, 
the expansion and contraction operators should be used only in cases where the 
granularity of representation must be changed for some particular applications. As an 
example, let us cite natural language processing applications: text analysis or 
summarization. At the application level, case grammar relation types could be used to 
help analyze or produce natural language sentences. At the conceptual layer, only 
those types relevant to the description of the domain should remain. A 
contraction/expansion operation would then be needed to change the granularity of
representation to accommodate that change of representation. 

R7. At the conceptual layer, all types should be considered primitive, 
unless specialized applications require a change of representation. 



2.4 Conformity of Individuals to Types 

As mentioned in 2.3 above, in order to facilitate the acquisition of individuals, the 
external layer of a CG system may allow multi-classification, i.e., the assignment of 
the same individual to different types. Since recommendation R6 states that the type 
hierarchy should be a lattice (because of the join operation) and since R2 requires that 
an individual be stored under its most specific form, a lattice completion algorithm 
could complete the type hierarchy so that a common subtype is introduced whenever 
an individual is assigned to two different types t_1 and t_2 with t_1 ≠ t_2. Again, the
independence of the external layer with regard to the conceptual layer may allow a 
user to see an individual classified under different types, though at the conceptual 
layer, it is assigned to only one type. 

R8. At the conceptual layer, there should be no multi-classification of
individuals.
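
A minimal Python sketch of how multi-classification at the external layer could be reduced to a single type at the conceptual layer is given below. It is not a full lattice completion algorithm: it only keeps the most specific of the assigned types and, when several incomparable types remain, introduces one new common subtype. The function names and the naming scheme for the new type (joining the names with "&") are hypothetical.

```python
def is_subtype(t, u, parents):
    """True if type t equals u or is a (transitive) subtype of u."""
    stack, seen = [t], set()
    while stack:
        cur = stack.pop()
        if cur == u:
            return True
        if cur not in seen:
            seen.add(cur)
            stack.extend(parents.get(cur, ()))
    return False


def most_specific_type(assigned, parents):
    """Return the single type under which an individual should be stored.

    assigned: the set of types given to the individual at the external layer.
    parents: maps a type to its direct supertypes; it is updated in place
    when a new common subtype has to be introduced.
    """
    # Drop every assigned type that is a supertype of another assigned type.
    minimal = sorted(
        t for t in assigned
        if not any(o != t and is_subtype(o, t, parents) for o in assigned)
    )
    if len(minimal) == 1:
        return minimal[0]
    new_type = "&".join(minimal)  # hypothetical name for the new subtype
    parents.setdefault(new_type, list(minimal))
    return new_type


parents = {"STUDENT": ["PERSON"], "EMPLOYEE": ["PERSON"]}
print(most_specific_type({"PERSON", "STUDENT"}, parents))    # STUDENT
print(most_specific_type({"STUDENT", "EMPLOYEE"}, parents))  # EMPLOYEE&STUDENT
```

The external layer can still show the individual under STUDENT and under EMPLOYEE, while the conceptual layer stores it once, under the introduced subtype.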



2.5 Conceptual Relations 

When a binary relation is defined, there is no need to define its inverse relation at the 
conceptual layer, as it is implicit. However, natural language applications based on 
CG may require it. Then it could either be defined at the external layer or could be 
defined at the conceptual layer through a metamodel that would link the conceptual 
relation with its inverse. We will not get into more details about metamodeling using 
CG at this point; a forthcoming paper should introduce a theory on the subject. For
more details see [5]. For now, we could suggest, also for interpretability purposes, to 
choose the name of a conceptual relation in terms of its role with regard to its 
arguments but also so that reading it from its input to its output arguments would 
come close to some natural language interpretation of its associated semantics. 

R9. The name of a conceptual relation should be chosen so that its 
purpose becomes clear when reading a graph that contains it, from its 
input toward its output arguments. 



R10. Its inverse relation should be omitted from the conceptual layer.
Binary relations may be reflexive, symmetrical or transitive. If reflexive or
symmetrical, a relation of the same type should be added to the same concepts, but in 
the inverse order of the parameters. 

R11. Reflexivity and symmetry completion should be achieved.
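
For the symmetric case, the completion can be sketched as follows in Python. The triple encoding of binary relations and the relation names are hypothetical, and reflexivity is not handled by this sketch.

```python
def complete_symmetry(relations, symmetric_types):
    """Add, for every relation of a symmetric type, the same relation with
    its two arguments in the inverse order.

    relations: set of (relation_type, first_concept, second_concept).
    """
    completed = set(relations)
    for rel_type, a, b in relations:
        if rel_type in symmetric_types:
            completed.add((rel_type, b, a))
    return completed


facts = {("MARRIED_TO", "Ann", "Bob"), ("PARENT_OF", "Ann", "Carl")}
completed = complete_symmetry(facts, {"MARRIED_TO"})
print(("MARRIED_TO", "Bob", "Ann") in completed)   # True
print(("PARENT_OF", "Carl", "Ann") in completed)   # False
```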

As for the transitivity property, a transitivity completion rule should be produced for 
that relation, and treated as a production rule (see Section 3.3 below). 



2.6 Assertions and Negations 

For simplicity reasons, only true graphs should be kept in the system so that the truth- 
value of retrieved information is never questioned and the inference mechanisms are 
kept simple, provided that the inference operators are sound and truth-preserving. 

R12. Asserted graphs are assumed to be true. 

All graphs in a CG system form a generalization hierarchy [6]. Constraint graphs 
represent empty subspaces of this generalization hierarchy, i.e., subspaces made of all 
graphs onto which they could project themselves (which are specializations of them). 
Any graph classified in such a subspace would automatically be known to be false 
since it violates a constraint. Therefore, when a graph whose truth-value is known to 
be false is encountered, it should be transformed into a constraint graph so that all the 
other graphs within the system are validated against it and the consistency of the
system is upheld. Making constraint graphs out of false graphs is actually a way to
ensure the consistency of the system while providing the basis for explicitly handling 
negated information. 

R13. False graphs should be represented as constraint graphs. 

Again, in order to keep only true graphs in the system, the application of the
specialization operators of the CG theory (the restrict and join operators) should be
truth-preserving.

R14. A model checking procedure should be embedded in the 
application of any specialization operator. 

This model checking procedure will prove to be useful also for answering queries, as 
a negative answer to some query could be produced as soon as the query itself is 
proven to have no model that supports it (see [7]). 
3 Unicity of Representation: Complex Issues

3.1 Embedded Graphs, Memory Structure and Coreferencing 

An additional computation problem with the CG formalism arises when embedded 
graphs are needed to model a domain. That adds to the complexity of the projection 
algorithm. Should projections be computed for all graphs without consideration of 
their embedding? Should all graphs be asserted on the universal sheet of assertion or 
could they be divided into micro-worlds? [8] proposed the introduction of micro- 
worlds as a way to classify relevant knowledge together mainly for efficiency 
purposes. Can we determine how conceptual graphs can be divided into different 
sheets of assertion? One of the purposes of what we proposed in [9] was to ensure 
that each asserted graph would be flattened out. If so, however, the CG system would 
then handle a set of sheets of assertion instead of only one, where each asserted graph 
would not embed any other graph. In brief, in [7] we proposed to extract the most 
embedded graph in a CG, called extent graph, and associate it with the graph from 
which it was extracted, called its intent graph. We would then replace the extent 
graph in its intent graph by a generic marker. Each intent graph identifies a context, 
each extent graph is asserted only in that context. The extent graph is said to be flat 
since it does not contain any embedded graph. If the intent graph is not flat, then the 
procedure is applied iteratively until all graphs have been flattened out. Figure 5 
shows a context where the intent and extent graphs were produced from the graph of 
Figure 4. 
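
The flattening procedure can be sketched in Python as follows. This is an illustration under stated assumptions rather than the procedure of [7]: a graph is encoded as a dictionary, an embedded graph is the referent of a concept, the string "*" stands for the generic marker, and all names are hypothetical. The sketch extracts a flat embedded graph along the first nested branch it meets, which may not be the deepest one overall, and repeats until nothing is embedded.

```python
import copy


def is_flat(graph):
    """A graph is flat when none of its concepts has a graph as referent."""
    return not any(isinstance(ref, dict) for _, ref in graph["concepts"].values())


def innermost_host(graph):
    """Return (host, concept_id) where the referent of concept_id in host is
    a flat embedded graph, or None if graph is already flat."""
    for cid, (_, ref) in graph["concepts"].items():
        if isinstance(ref, dict):
            return innermost_host(ref) or (graph, cid)
    return None


def flatten(graph):
    """Split a nested graph into contexts made of an intent graph, the
    concept that held the embedded graph, and the flat extent graph."""
    graph = copy.deepcopy(graph)  # leave the caller's graph untouched
    contexts = []
    found = innermost_host(graph)
    while found is not None:
        host, cid = found
        ctype, extent = host["concepts"][cid]
        host["concepts"][cid] = (ctype, "*")  # generic marker replaces the extent
        contexts.append({"intent": host, "concept": cid, "extent": extent})
        found = innermost_host(graph)
    return contexts


nested = {
    "concepts": {
        "p": ("PERSON", "Tom"),
        "b": ("PROPOSITION", {
            "concepts": {
                "q": ("PERSON", "Mary"),
                "s": ("SITUATION", {
                    "concepts": {"c": ("CAT", "*")},
                    "relations": [],
                }),
            },
            "relations": [("WANT", "q", "s")],
        }),
    },
    "relations": [("BELIEVE", "p", "b")],
}
contexts = flatten(nested)
print(len(contexts))                                  # 2
print(all(is_flat(c["extent"]) for c in contexts))    # True
```

Each context pairs an intent graph with a flat extent graph, so projections can be computed on flat graphs only.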




Fig. 4. An example of an embedded graph used to produce the context of Figure 5. 




Fig. 5. A context with its intent and extent graphs in the top and bottom rectangles resp. 
Finally, in order to avoid duplicating graphs and to speed up the computation of
the projection operation, we proposed in [7] to compute a lattice structure based on 
formal concept analysis [10] from all contexts produced by the procedure described 
above. Independently of the procedure applied, we can state the following 
recommendation. 

R15. Whenever possible, a graph flattening procedure should be 
applied in a way that the semantics of the graphs would not be altered, 
but that the projection operation would be facilitated. 

In any case, a single sheet of assertion for large size CG systems may bring the 
whole system into a state of brittleness. There is a need to provide fast access to the 
relevant information both at the implementation (see Section 4 below) and conceptual 
layer. At the conceptual layer, a memory organization must be designed if it impacts 
the processing capabilities of the system. For instance, in the design of a multi-agent 
system, each agent must have its private memory and may share information with 
other agents. For that application, a particular memory structure must be devised at 
the conceptual layer (see [11] for an example involving multi-agent systems). If 
background knowledge on the tasks for which the system is being developed is 
available, it may be possible to determine a memory structure that would benefit 
them. 



R16. The conceptual organization of the memory structure should be 
designed according to the tasks that the system must perform. 

As a consequence, and as any CG system developer knows, the coreferencing of objects 
should be carried out over all sheets of assertion, that is, globally. This is especially 
true for systems having more than one sheet of assertion. 

R17. The coreferencing mechanism of a CG system should be global. 



3.2 Rule Completion* 

Another source of worry when dealing with CG systems, where the projection 
operation is so important, is the rule completion dilemma. With production rules as 
described in [12] (an example is given in Figure 6 below), should we apply the 
rules to the relevant graphs prior to the computation of the projection operation? How 
do we handle rules such as transitivity completion rules? Rules 
may increase the size of a graph and of the overall CG base tremendously, even 
without bound. 

In order to remedy this problem, each asserted graph should be represented in its 
most restricted format, that is, it should not include any information that could have 
been derived from the application of production rules. That way, only its kernel 
information would need to be explicitly represented, that is, the information that 
cannot be derived from the application of production rules. Consequently, a CG base 
would only contain kernel graphs, acting as axiomatic information, and production 
rules from which information may be inferred. 

* Here we assume that the if-then rules described in this section only use the join operator and 
cannot delete any information. Therefore, they are logical entailment rules. 



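[Figure 6: an if-then rule whose graphs involve the concepts INTEGER: *x1, INTEGER: *x2 and INTEGER: *x3; the relations are not recoverable from the source.] 

Fig. 6. An example of a production rule under the CG formalism. 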



Of course, before any graph is asserted, a kernel extraction process must be run on 
it; the resulting graphs are called kernel graphs. Kernel graphs can be found by 
applying the production rules in backward chaining mode. Unfortunately, 
several kernel graphs may arise from this kernel extraction mechanism. 
This is the price to pay for the uniqueness of representation that we seek to achieve here. 
Also, subsequent queries would have to undergo the same kernel extraction 
process before being evaluated, so that the projection operation easily finds all the 
projections of a query graph onto the relevant kernel graphs. 

R18. When production rules are part of the system, a CG system 
should keep all asserted graphs under their kernel representation. 
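
The idea behind kernel graphs can be illustrated with a minimal Python sketch. It rests on several assumptions that are not part of the paper: graphs are reduced to sets of (subject, relation, object) triples, there is a single join-only rule (transitivity of a hypothetical "gt" relation), and derivability is tested by forward chaining to a fixed point rather than by the backward chaining mentioned above. The names transitive_rule, closure and kernel are illustrative only.

```python
from itertools import product


def transitive_rule(facts):
    """One join-only rule: (a gt b) and (b gt c) entail (a gt c)."""
    derived = set()
    for (a, r1, b), (c, r2, d) in product(facts, repeat=2):
        if r1 == r2 == "gt" and b == c:
            derived.add((a, "gt", d))
    return derived


def closure(facts, rules):
    """Forward chaining to a fixed point (finite here: the rule adds no new terms)."""
    closed = set(facts)
    while True:
        new = set().union(*(rule(closed) for rule in rules)) - closed
        if not new:
            return closed
        closed |= new


def kernel(facts, rules):
    """Greedily drop every fact that the remaining facts still entail."""
    kept = set(facts)
    for fact in sorted(facts):
        rest = kept - {fact}
        if fact in closure(rest, rules):
            kept = rest
    return kept


asserted = {(3, "gt", 2), (2, "gt", 1), (3, "gt", 1)}
kernel_facts = kernel(asserted, [transitive_rule])
print(kernel_facts)  # (3 gt 1) is derivable, so it is not stored
```

The greedy pass returns one kernel only; as noted above, several kernels may exist, and which one is found depends here on the order in which facts are visited.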



3.3 Procedural Attachments: Actors and Processes 

Actors [6] and processes [13] may be useful to CG systems as they allow the 
inclusion of update procedures that are triggered automatically when some assertions 
are made. In brief, an actor instantiates its generic output concepts based on some 
computation from its input arguments (once they are all instantiated). As an example, let 
Figure 7 below define the > actor. This actor computes a Boolean value based on the 
mathematical relation > between two integers. 

This actor would be useful to enforce the semantics of the > relation when used in 
a CG. By recommendation R12 above, all asserted graphs are assumed to be true, 
but there is no way to ensure that the semantics of the > relation in Figure 8 below is 
not violated. Since the set of integers is infinite, symbolic computation would 
need an explicit representation of the > relation for an infinite set of pairs of integers 
in order to compute the truth-value of this relation for particular arguments. 




Fig. 7. The definition of the > actor, called actor graph. 




Fig. 8. A CG that uses the > relation. 



However, whenever the > relation is used in a graph g, a rule could be triggered so 
that the actor graph is added to the graph of Figure 8. That would result in a call to 
the > actor and the subsequent instantiation of the Boolean concept (see Figure 9). 
The Boolean concept being instantiated to False would provide the information 
required to alert the system's administrator of this state of affairs. 




Fig. 9. Joining the graphs of Figures 7 and 8 together. 

In this example, the semantics of the > relation was encoded in a procedural form 
in the > actor (because it could not have been captured in a finite number of 
assertions using the > relation alone). Actors are thus useful to represent domain 
constraints for infinite or very large domains. They are also useful to automatically 
query a database for the benefit of some decision making process that the CG system 
is carrying out. 
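
A procedural attachment of this kind might look as follows in Python. This is only a sketch under stated assumptions: an assertion is simplified to a (left, relation, right) triple, the actor is an ordinary function, and the names gt_actor, ACTORS and check_assertion, as well as the sample values, are hypothetical rather than taken from Figures 7 to 9.

```python
def gt_actor(left, right):
    """Procedural semantics of the > relation: the referent of the Boolean output concept."""
    return left > right


# Relation label -> attached actor (an assumed registry, not a CG standard structure).
ACTORS = {">": gt_actor}


def check_assertion(triple, actors):
    """Run the actor attached to the relation, if any.

    Returns True or False when an actor exists, and None when the relation
    has no procedural attachment and therefore cannot be verified this way.
    """
    left, relation, right = triple
    actor = actors.get(relation)
    if actor is None:
        return None
    return actor(left, right)


# A violated assertion yields False, which a system could use to raise an alert.
print(check_assertion((3, ">", 5), ACTORS))
```

Keeping the actors in a registry keyed by relation label means that the declarative graphs stay unchanged; only relations that have an entry trigger a computation.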

Processes have been defined in various ways in the literature [14]. For instance, 
[13] defines them as actors whose input and output arguments are assertions, that is, 
concepts of type STATEMENT with CGs as their referents. For example, through the 
use of a process that recognizes a bad use of the > relation (see Figure 10), some 
action can be undertaken to alert the system administrator and ensure that the system 
consistency is upheld. 

Actors provide gateways to procedural computation, allowing CG systems to 
benefit from both procedural and declarative programming paradigms. Processes 
implement state changes in a system. Contrary to if-then rules, they may trigger any 
type of state change, making the evolution of the system non-monotonic. Therefore, 
they should be used with great care. Non-monotonic systems provide a level of 
complexity that is beyond that of first-order logic. For instance, kernel graphs would 
need to be recomputed for each state of the system. 



4 The Implementation Layer: A Few Tips 

4.1 Indexes 

In the same line of thought, when designing the implementation layer, indexes on the 
asserted graphs will need to be built so that the computation of the projection 
operation is as fast as can be. These indexes could be supported by a relational 
database system where an indexing mechanism is provided. But, as a first filter, an 
indexing mechanism built from the asserted graphs themselves [15] could be 
beneficial to the efficiency of a CG system. Also, feature selection algorithms could 
preprocess the asserted graphs so that the most useful subgraphs in terms of their 
discrimination power are identified and used as indexes. However, the type of index 
that is the most suited for a given application remains to be investigated. 

R19. The investigation of proper indexing mechanisms should be 
conducted so that an appropriate implementation platform is chosen in 
light of its capability to retrieve graphs efficiently whenever needed. 
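
As an illustration of an index used as a first filter, the sketch below builds an inverted index from relation labels to graph identifiers. It is not the indexing mechanism of [15]. It assumes that graphs are lists of (concept type, relation, concept type) triples and that relation labels must match exactly (no relation type hierarchy), so that sharing all relation labels is a necessary but not sufficient condition for a projection to exist. All names are hypothetical.

```python
from collections import defaultdict


def build_relation_index(graphs):
    """Map each relation label to the ids of the asserted graphs that use it."""
    index = defaultdict(set)
    for graph_id, triples in graphs.items():
        for _, relation, _ in triples:
            index[relation].add(graph_id)
    return index


def candidate_graphs(index, query):
    """Ids of graphs that contain every relation label of the query.

    Only these candidates need to be passed to the (costly) projection
    operation; graphs filtered out here cannot contain a projection.
    """
    labels = {relation for _, relation, _ in query}
    if not labels:
        return set().union(*index.values())
    return set.intersection(*(index.get(label, set()) for label in labels))


graphs = {
    "g1": [("Cat", "on", "Mat")],
    "g2": [("Cat", "eat", "Lasagna"), ("Cat", "on", "Mat")],
    "g3": [("Dog", "eat", "Bone")],
}
index = build_relation_index(graphs)
print(candidate_graphs(index, [("Animal", "eat", "Food"), ("Animal", "on", "Thing")]))
```

Concept types are deliberately left out of the index, since a query concept may project onto a concept of a more specialized type.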



4.2 The Technological Platform 

I must also add that industry is usually not too keen on in-house solutions. The 
choice of a technological platform (either software or hardware) as a 
solution to a particular problem relies heavily on the number of person-years put into its 
development and testing, on the stability of the platform, on its expected 
longevity (will it be supported?), on its portability, on market trends, and on its 
development, implementation and maintenance costs. 

R20. The implementation layer should rely on a well-supported 
industrial-type platform (like a relational database management 
system for instance). 




Fig. 10. A simple process (rule) that detects bad uses of the > relation. 



Though maybe not sufficient, the starting point of the implementation layer should be 
some heavily tested and broadly used industrial product. Add-ons may be 
needed to fit the specificity of CG based systems. The implementation layer should 
therefore allow extensions to be added to the basic platform. 

R21. The technological platform chosen as the basis for the 
implementation layer should be based on an open architecture, 
allowing gateways to other systems and programming environments. 

In the CG literature, the implementation layer has been neglected so far. Considering 
the risk nowadays associated with selecting one technology over another, there 
will be no large scale use of CG systems until a well-tested platform is available. 
Consequently, CG researchers should now tackle the problems associated with the 
implementation layer of a CG system. Though past efforts have not always been 
successful for various reasons (lack of proper specifications, coordination and funds), 
I believe that individual efforts may provide enough background knowledge on the 
pitfalls of this implementation layer so that financial opportunities may 
eventually arise for targeted applications. 

R22. The implementation layer of a CG system should receive 
attention from the CG community if a large scale CG tool platform is 
to be developed, which would help disseminate the theory. 



5 Conclusion 

This paper tried to demonstrate that a CG system can be viewed as a three-layer 
architecture. In my opinion, this architecture is of extreme importance to the 
understanding and planning of new technological improvements in the CG theory. As 
a research community, we should relate our work to this architecture, providing the 
pieces of a big puzzle, pieces that will eventually fit. More emphasis should be put on 
the impact of our work on these layers and thus on the mappings between layers. 

All three layers are important to the dissemination of the CG theory. The external 
layer introduces potential users to CG systems. It should be designed so that a "super" 
user (the knowledge modeler) is provided with simple editing facilities through a 
graphical interface. As of now, only Charger [16] seems to provide a GUI and be 
widely available (written in Java). 

Of course editing CG is far from enough. Knowledge processing under the CG 
formalism should also be available. Towards that goal, theoretical results are now 
abundant [17-23]; CG systems are well understood by the community. Some 
prototypes and specialized tools exist [16]. However, any industrial tool should be 
based on a widely supported platform where the implementation layer was carefully 
designed to meet industrial constraints. 

As a whole, a CG-based tool for industrial applications should deal with the 
following issues in order to obtain the necessary credibility in the market place: 
efficiency, scalability, portability, flexibility, simplicity of development, use and 
maintenance, and interpretability. Based on my personal experience, the 
recommendations put forth in this paper aim at facilitating the development of CG 
systems with regard to these issues. They are by no means exhaustive nor absolute; 
specific needs may require other ways of dealing with these issues. However, they 
could be considered as a first set of guidelines when having to develop large scale CG 
based systems which, in the end, is our common goal. 

Abstract. Knowledge management, in particular corporate knowledge 
management, is a challenge companies and researchers have to meet. The 
conceptual graph formalism is a good candidate for the representation of 
corporate knowledge, and for the development of knowledge management 
systems. But many of the issues concerning the use of conceptual graphs 
as a metalanguage have not been worked out in detail. By introducing 
a function that maps higher level to lower level, this paper clarifies the 
metalevel semantics, notation and manipulation of concepts in the con- 
ceptual graph formalism. In addition, this function allows metamodeling 
activities to take place using the CG notation. 



1 Introduction 

Knowledge management, especially corporate knowledge management, is a chal- 
lenge companies and researchers have to meet. In a previous work, we compared 
conceptual graphs with other formalisms [3, 5] and we concluded that the con- 
ceptual graph formalism was a good candidate for the representation of corporate 
knowledge, and for the development of knowledge management systems. 

Conceptual graphs are a knowledge representation formalism introduced by 
John Sowa [7] where objects of the universe of discourse are modeled by concepts 
and conceptual relations that associate concepts. Conceptual graphs have been 
extensively used and studied by a large scientific community. Using the conceptual 
graph formalism, a corporate memory has been developed at the Research & Development 
Department of DMR Consulting Group Inc in order to memorize the 
methods, know-how and expertise of its consultants [4]. This corporate memory, 
called Method Repository, is a complete authoring environment used to edit, 
store and display the methods used by the consultants of DMR. The core of 
the environment is a knowledge engineering system based on conceptual graphs. 
Four methods are commercially delivered and their documentation in paper and 
in hypertext format is generated from conceptual graphs. About two hundred 
business processes have been modeled and, from about 80,000 conceptual graphs, 
we generated more than 100,000 HTML pages in both English and French that 
can be browsed using commercial Web browsers. 

However, some fundamental aspects of conceptual graphs remain ambiguous 
and not formally specified. For example, the notation of concepts is syntactically 
defined in the new CG standard [1] but its use and its semantics seem ambiguous. 
Figure 1 illustrates this problem of notation. How could one represent the 
notion (concept) of the person John using conceptual graphs? 






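[Figure 1: four candidate notations: [Person : #34], [Person : John], [Person : 'John'], and a Person concept whose referent field holds a picture (not recoverable from the source).] 

Fig. 1. Which notation? 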



Another example is the manipulation of embedded graphs. Concepts may 
have conceptual graphs in the referent field. Figure 2 illustrates this problem of 
manipulation. How could a CG system access the conceptual graph that is in 
the referent field of this graph? 







[Figure 2: [Graph : [Cat : Garfield] -> (eat) -> [Lasagna]]] 

Fig. 2. How to access the embedded graph? 



We propose in this paper a metamodeling approach which semantically 
defines the notation and manipulation of concepts in the CG language. Little 
work has been done on metamodeling and conceptual graphs. John Esch in [2] 
introduces two relationships: Kind, which links a concept to its type, and Subt, which 
links two concept types that are in a subtype relationship. Using 
the relationship Subt, he defines a type hierarchy for each higher order level and, 
using the relationship Kind, he links types from different levels. But Esch does not deal 
with conceptual relations and relation types. In [8] Michel Wermelinger defines 
higher order types more formally and proposes a translation to first order logic. 
He defines one hierarchy for all the concept types and one hierarchy for all the 
relation types and organizes them with regard to their nature and their order. But 
neither proposes a complete metamodel of the conceptual graph language itself. 

These two approaches are compatible and our work extends the notions introduced 
by these authors. We formally define the conceptual graph language 
using conceptual graphs and we propose in this paper a notation for referents 
based on the metamodel. In addition, we unify the two operators τ and ρ [6] in 
one general operator ω that maps a higher order level to a lower order level and 
so allows the manipulation of concepts at the lower levels. 

This paper¹ is organized as follows. Section 2 presents an overview of the 
metamodel of the conceptual graph formalism and Section 3 details five basic 
components of this metamodel: element, concept, referent, individual concept 
and generic concept. Based on this metamodel, Section 4 formalizes the notation 
of referents in concepts and Section 5 introduces the function ω that links a 
concept to the entity it represents and illustrates the use of this function in 
metamodeling, laying the foundation for a CG theory of metamodeling. Section 
6 concludes and presents future work. 

2 CG Metamodel Overview 

This section gives an overview of the conceptual graph metamodel. The meta- 
model defines the basic components needed to represent knowledge. Figure 3 
presents the concept type hierarchy of the CG language metamodel. 




Fig. 3. Metamodel Concept Type Hierarchy. 



At the highest level, we have external elements (ExternalElt) that are part 
of the real world to be represented and internal elements (InternalElt) that are 
building blocks of the CG language. 

External elements represent entities of the Universe of Discourse that are 
outside of the system but can be referenced by internal elements. 

Internal elements are categorized under six types: Referent, Graph, Context, 
Type, CorefLink, and GraphElement. Referents are the proxies that stand for entities 
of the Universe of Discourse; Graphs are the sentences of the CG language; 
Contexts help to cluster knowledge; Types are categories to classify entities; 
Coreference Links associate elements that represent the same entities; and Graph 
Elements are concepts, relations and arcs. 

¹ This work is part of a research project supported by HEC Montreal. 

Among concepts we distinguish individual concepts (IndividualConcept) that 
represent identified entities and generic concepts (GenericConcept) that represent 
unidentified entities. 

Graph has two specialized subtypes: DefinitionGraph and RestrictionGraph. Definition 
graphs are used to define concept types and relation types. Restriction 
graphs are graphs that must always be false. They are used to state constraints 
on types. If and Then are special cases of restriction graphs where the conditions 
they represent must already be present or be acquired simultaneously. 

3 CG Metamodel Components 

This section details the core components of the conceptual graph formalism that 
are relevant to the notation and manipulation of concepts. We give a formal 
definition, using conceptual graphs, of the five basic concept types related to concepts: 
Element, Concept, Referent, IndividualConcept and GenericConcept. Before doing so, 
and in order to make the specification graphs used in these formal definitions 
understandable, we illustrate with an example how concept types are specified. 

3.1 Concept Type Specification 

Concept Types are defined by three kinds of graphs : definition graph, restriction 
graph, and rule graph as illustrated in Figure 4. This figure shows the graph that 
specifies the concept type Driver. Definition, restriction and rule graphs give the 
necessary and sufficient conditions to recognize instances of Driver. 




Fig. 4. Type definition of Driver. 



Definition Graph. The definition graph shows the conceptual relations that a 
concept must have. In the example, a person that drives a car is a driver under 
the conditions stated by restriction graphs and rule graphs. 

Formally, in [7], a concept type is defined by a lambda expression. The example 
in Figure 4 states the following equation: 

Driver = [Person : λ] -> (drives) -> [Car]. 

The symbol λ shows that the concept Person is the formal parameter. In the 
graphic form and in the linear form the symbol λ is replaced by a question mark. 



Restriction Graph. Restriction graphs specify supplementary and necessary 
conditions; these graphs are forbidden graphs or subgraphs. They show particular 
topologies that must not exist. In our example, the restriction graph states that 
a driver cannot drive two cars². 



Rule Graph. Rule graphs specify other supplementary and necessary conditions 
that are expressed more easily with rules³. In our example, the rule graph 
states that if a driver drives a car on a road then he has a driver license. 
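
The three parts of the Driver specification can be read as three successive checks. The Python sketch below is one possible reading, not the paper's formalization: facts are (subject, relation, object) triples, type checking is assumed to have been done already (the subject is a Person and the objects of drives are Cars), and the labels "drives", "on", "road", "has" and "driver_license" are hypothetical encodings of the graphs of Figure 4.

```python
def is_driver(person, facts):
    """Check the definition, restriction and rule graphs of Driver against a set of triples."""
    cars = {obj for subj, rel, obj in facts if subj == person and rel == "drives"}

    # Definition graph: the person drives a car.
    if not cars:
        return False

    # Restriction graph: the forbidden pattern is one driver driving two different cars.
    if len(cars) > 1:
        return False

    # Rule graph: driving a car on a road requires a driver license.
    on_road = any((car, "on", "road") in facts for car in cars)
    has_license = (person, "has", "driver_license") in facts
    return has_license or not on_road


facts = {
    ("tom", "drives", "car1"),
    ("car1", "on", "road"),
    ("tom", "has", "driver_license"),
}
print(is_driver("tom", facts))
```

The rule is encoded as an implication (no road, or a license), which reflects the footnoted remark that a rule graph can be rewritten as a restriction graph.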

3.2 Element 

The conceptual graph language is made of elements that are combined to rep- 
resent knowledge. Figure 5 shows the specification graph of type Element. An 
element is symbolized by one or more symbols but one symbol may not be 
linked to two different elements. Symbols are unique identifiers in the system. 




Fig. 5. Specification graph of type Element. 



3.3 Concept 

The notion of concept is the fundamental notion of the conceptual graph theory. 
A concept is the representation of an object, an idea or any notion one can 
perceive and express. 

² Two different boxes represent two different concepts. 

³ Any rule graph can be rewritten as a restriction graph. 

Definition 1. A concept is the representation of an object of the Universe of 
Discourse. It is the assembly of two parts: a referent that identifies the represented 
object and the type that classifies it. 

Figure 6 shows the specification graph of type Concept. The definition graph 
states that a concept is an internal element that has a type, a referent, and is 
an element of a graph. Restriction graphs clarify the building rules: a concept has one 
and only one type and one and only one referent. 




Fig. 6. Specification graphs of type Concept. 



One can notice that one type may be associated with several different referents 
and one referent may be associated with different types to make up different 
concepts. This latter mechanism supports the representation of points of view: 
one object may be perceived in different ways. 

Examples of concepts are : 

[Person] , [TaxiDriver] , [Concept] 

[Person : #Tom] , [Concept : #624] , [ConceptType : Person] 

Examples of concepts whose concept type is Concept are: 

[Concept : #624] , [Concept : [ConceptType : Person] ] , 

[Concept : [Referent : #624] ] 

3.4 Referent 

The referent is the part of a concept that represents and identifies the object of 
the universe of discourse for which the concept is an interpretation. 

Definition 2. A referent is a proxy for an object of the universe of discourse in 
the knowledge base. A referent is made up of a quantifier and a designator that 
refers to the object. 

Figure 7 shows the specification graph of type Referent. A referent is a part of 
a concept and represents an element (external or internal). Restriction graphs 
state that one referent represents one and only one element and two referents 
cannot represent the same element. There exists a special referent written #blank 
that 'represents' an element which exists but is not identified (see Section 3.6). 




Fig. 7. Specification graph of type Referent. 



Examples of referents are : 

#12, #blank, Tom, Person, ∀*x. 

3.5 Individual Concept 

Definition 3. An individual concept is a concept whose represented entity is 
known. The concept is an interpretation of an object of the universe of discourse 
that is identified and represented in the knowledge base by a referent other than 
#blank. 

Figure 8 presents the specification graph of type IndividualConcept. An individual 
concept has a referent that represents an element. The referent of an individual 
concept is different from the blank referent. 

Examples of individual concepts are : 

[Person : Tom] , [Referent : #624] , [ConceptType : Person] 



3.6 Generic Concept 

There exist two kinds of concepts depending on whether the entity represented by 
the concept is known or unknown. 

Fig. 8. Specification graphs of type IndividualConcept. 



Definition 4. A generic concept is a concept that represents an unknown entity. 
The concept is an interpretation of an object of the universe of discourse that 
exists but is not identified. 

Figure 9 presents the specification graph of type GenericConcept. A generic 
concept has the special referent #blank that represents an unidentified element. 
In practice this referent is omitted. 




Fig. 9. Specification graphs of type GenericConcept. 



Examples of generic concepts are : 

[Person], [TaxiDriver : #blank] , [Concept] 
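
The distinction between individual and generic concepts can be summarized in a small Python sketch. It is a simplification made for illustration: a referent is reduced to a single string (the quantifier and designator of Definition 2 are not modeled separately), and the class and attribute names are assumptions, not part of the metamodel.

```python
from dataclasses import dataclass

BLANK = "#blank"  # the special referent carried by generic concepts


@dataclass(frozen=True)
class Concept:
    """A concept is the assembly of exactly one type and exactly one referent."""

    type_label: str
    referent: str = BLANK  # an omitted referent stands for #blank, as in [Person]

    @property
    def is_generic(self):
        return self.referent == BLANK

    @property
    def is_individual(self):
        return not self.is_generic

    def linear_form(self):
        # In practice the #blank referent is omitted from the notation.
        if self.is_generic:
            return f"[{self.type_label}]"
        return f"[{self.type_label} : {self.referent}]"


print(Concept("TaxiDriver", "#blank").linear_form())
print(Concept("Person", "Tom").linear_form())
```

Making the class immutable mirrors the restriction graphs: once built, a concept cannot acquire a second type or a second referent.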



4 Notation 

This section formalizes the notation of concepts in the conceptual graph formal- 
ism. Individual concepts are the basic components of the CG language. Individ- 
ual concepts are concepts that are abstractions of well identified entities. The 
referent slot of an individual concept is symbolized by a unique literal preceded 
by the symbol #. The referent represents the identified entity in the system. The 
entity of the universe of discourse itself may be symbolized. Figure 10 presents 
the underlying metamodel. A concept has a referent. This referent is symbolized 
by a symbol and represents an element that may also be symbolized. 

An alternative notation for the concept replaces the symbol of its referent by 
any symbol of the represented element. This mechanism may be formalized by 
a metalevel rule as illustrated in Figure 11. If an element represented by the 
referent of a concept is itself symbolized by symbols then the referent may be 
symbolized by the same symbols. 
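
The notation rule can be sketched in Python as follows. The sketch assumes a simple symbol table in which each symbol names at most one element (Section 3.2), and it treats the referent and the represented element as two entries of that table. The class SymbolTable, the function notations and the identifiers "referent-34" and "john" are hypothetical names introduced for the example.

```python
class SymbolTable:
    """Symbols are unique identifiers: one symbol names at most one element."""

    def __init__(self):
        self._element_of = {}  # symbol -> element
        self._symbols_of = {}  # element -> list of its symbols

    def symbolize(self, element, symbol):
        owner = self._element_of.setdefault(symbol, element)
        if owner != element:
            raise ValueError(f"symbol {symbol!r} already names {owner!r}")
        symbols = self._symbols_of.setdefault(element, [])
        if symbol not in symbols:
            symbols.append(symbol)

    def symbols(self, element):
        return list(self._symbols_of.get(element, []))


def notations(type_label, referent, represented, table):
    """Notation rule: the referent may be written with its own symbols or with
    any symbol of the element it represents."""
    choices = table.symbols(referent) + table.symbols(represented)
    return [f"[{type_label} : {symbol}]" for symbol in choices]


table = SymbolTable()
table.symbolize("referent-34", "#34")
table.symbolize("john", "John")
table.symbolize("john", "'John'")
print(notations("Person", "referent-34", "john", table))
```

All the notations produced for one concept are interchangeable, which is what the three examples of this section rely on.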

We illustrate this rule with three different examples: notation of concept, 
notation of concept type and notation of graph. 

Fig. 10. The metamodel of symbolization. 

Fig. 11. The notation rule. 



4.1 Notation of Concept 

Many different notations have been used in the literature to denote individual 
concepts. We give here the example of the representation of the person John. 
According to the metamodel of Figure 10, the metalevel conceptual graph describing 
the concept that stands for the person John is presented in Figure 12. 
The concept [Person : #34] has a referent that is symbolized by #34. This referent 
represents the element John (an external element), which can be symbolized by 
different symbols: a string, a symbol⁴ and an image. 




Fig. 12. The meta conceptual graph for notation of John. 



⁴ A symbol is different from a string. It is a whole and cannot be edited. 

By applying the notation rule defined above, the referent #34 may be replaced 
by any of the symbols that stand for the person John. Figure 13 shows the 
possible notations of the same concept. 






[Figure 13: [Person : 'John'], [Person : #34], [Person : John], and a Person concept whose referent field holds an image.] 

Fig. 13. Different and equivalent notations of John. 



4.2 Notation of Concept Type 

In the literature, concept types, when used as concepts, are denoted with the type 
label in the referent field. Figure 14 presents the meta level conceptual graph 
describing the concept that stands for the concept type Person. The concept 
[ConceptType : #15] has a referent symbolized by #15. The referent represents the 
type Person that is symbolized by the symbol Person. 













[Figure 14: the concept [ConceptType : #15] is linked by (ref) to a Referent, which is linked by (rep) to an Element; the Referent is linked by (symb) to [Symbol : #15] and the Element by (symb) to [Symbol : Person].] 

Fig. 14. The meta conceptual graph for notation of the type Person. 

By applying the notation rule the referent #15 may be replaced by the symbol 
that stands for the type. Figure 15 shows the two possible notations of the 
concept type. 



[Figure 15: [ConceptType : #34] and [ConceptType : Person].] 



Fig. 15. Different equivalent notations of the type Person. 



4.3 Notation of Graph 

Concepts that represent conceptual graphs are denoted with the type Graph in the 
type field and the graph itself in the referent field. Figure 16 presents the meta 
level conceptual graph describing the concept that stands for the graph [Cat : 
Garfield] -> (eat) -> [Lasagna]. The concept [Graph : #72] has a referent symbolized 
by #72. The referent represents the graph that is symbolized by a symbol 
GarfieldMeal, a linear form and a first order logic form of the graph. 




Fig. 16. The meta conceptual graph for notation of the Garfield meal. 



By applying the notation rule, the referent #72 may be replaced by any of 
the symbols that stand for the graph. Figure 17 shows possible notations of the 
same graph. 




Fig. 17. Different equivalent notations of the graph Garfield Meal. 



Using this notation mechanism, we can define named elements. We will 
write [Graph : GarfieldMeal [Cat : Garfield] -> (eat) -> [Lasagna]] to state that 
GarfieldMeal and [Cat : Garfield] -> (eat) -> [Lasagna] are two symbols of the 
graph. Any subsequent reference to the graph may be made by the concept 
[Graph : GarfieldMeal]. For example, this mechanism may be applied to situations: 
it allows a situation to be named, which avoids repeating the graph that 
describes the situation every time we refer to it. 

5 Meta Level to Data Level and Data Level to Entities 

This section introduces the function that links a concept to the entity or entities 
it represents. 

In [6] Sowa describes a mapping between the meta level and the data level. 
To translate a meta level statement into a data level statement, Sowa introduces 
two functions τ and ρ. The function τ translates a referent name into a type 
label. The example below illustrates the use of the function τ. The meta level 
statement: t1 is a subtype of t2 is transformed into a data level statement: every 
t1 is a t2. 

IF : [Type : *t1] -> (subt) -> [Type : *t2] 
THEN : [τt1 : ∀*x] [τt2 : ?x] 

The function ρ has the same behavior as τ on relation types and relations; it 
translates the name of a relation into a relation type label. The example below 
presents the translation rule from the meta level to the data level in 
Entity-Relationship diagrams. 

IF : [Type : *t1] -> (arg1) -> [Relation : *r] <- (arg2) <- [Type : *t2] 

THEN : [τt1] -> (ρr) -> [τt2] 



5.1 Function ω 

More generally, we need a function to access the entity represented by a concept 
in order to use it. For example, we would like to access the image represented 
by the concept [Drawing : BeautifulLandscape] or we would like to be able to 
manipulate the graph represented by the concept [Graph: [Cat]->(on)->[Mat]] . 
Let us define ω as such a function. 

Definition 5. The function ω is defined over C → E, where C is the set of 
concepts that represent entities of the system and E is the set of all referenced 
elements (internal and external elements). 

Applied to a concept, the function ω returns the entity represented by the concept. 
Obviously, the function is defined on the set of concepts that represent 
entities of the system. 



ω([Drawing : BeautifulLandscape]) = (the image itself, not reproduced here) 
ω([Graph : [Cat] -> (on) -> [Mat]]) = [Cat] -> (on) -> [Mat] 
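
A function of this kind can be sketched in Python as a lookup from a concept to the element that its referent represents. The sketch makes assumptions that go beyond the text: a concept is a (type label, referent) pair, the "represents" relation is stored in a dictionary, and the graph is encoded as a nested tuple. The name omega and the sample referent #72 (borrowed from Section 4.3) are used for illustration only.

```python
def omega(concept, represents):
    """Return the element represented by the referent of the concept.

    `concept` is a (type_label, referent) pair and `represents` maps each
    known referent to the element it stands for. The function is only
    defined on concepts whose referent appears in `represents`.
    """
    type_label, referent = concept
    if referent not in represents:
        raise KeyError(f"omega is undefined on [{type_label} : {referent}]")
    return represents[referent]


# A graph encoded as (concept, relation, concept); None marks an omitted referent.
garfield_meal = (("Cat", "Garfield"), "eat", ("Lasagna", None))
represents = {"#72": garfield_meal}

graph = omega(("Graph", "#72"), represents)
print(graph[1])  # the embedded graph can now be manipulated, e.g. its relation read
```

The point of the function is visible in the last two lines: once ω has returned the graph itself rather than a concept about it, ordinary graph operations apply.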




5.2 ω versus τ and ρ 

Using the function ω, the above example using τ may be rewritten as follows: 

IF : [Type : *t1] -> (subt) -> [Type : *t2] 

THEN : [ω([Type : *t1]) : ∀*x] [ω([Type : *t2]) : ?x] 

where ω([Type : *t1]) returns the entity represented by the concept [Type : *t1], 
that is, the type represented by t1, and ω([Type : *t2]) returns the entity represented 
by the concept [Type : *t2], that is, the type represented by t2. 

To show the equivalence with τ and ρ, we abbreviate ω([Type : *t1]) as ωt1. 
The rule becomes: 

IF : [Type : *t1] -> (subt) -> [Type : *t2] 
THEN : [ωt1 : ∀*x] [ωt2 : ?x] 

In the same way, the translation rule for an E-R diagram becomes: 

IF : [Type : *t1] -> (arg1) -> [Relation : *r] <- (arg2) <- [Type : *t2] 
THEN : [ωt1] -> (ωr) -> [ωt2] 
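
The subtype rule can be applied mechanically, as in the following Python sketch. It is an illustration under assumptions: meta level facts are (referent, relation, referent) triples, ω is passed in as an ordinary callable from a referent to a type label, the data level statement is produced as a string in linear form, and the referents #15 and #16 are hypothetical.

```python
def translate_subtypes(meta_facts, omega):
    """Apply: IF [Type : *t1] -> (subt) -> [Type : *t2] THEN [ωt1 : ∀*x] [ωt2 : ?x]."""
    statements = []
    for t1, relation, t2 in meta_facts:
        if relation == "subt":
            statements.append(f"[{omega(t1)} : ∀*x] [{omega(t2)} : ?x]")
    return statements


# Hypothetical meta level: referent #15 represents the type Person, #16 the type Animal.
type_of = {"#15": "Person", "#16": "Animal"}
meta_facts = [("#15", "subt", "#16")]

for statement in translate_subtypes(meta_facts, type_of.__getitem__):
    print(statement)
```

The same callable could serve for relation referents as well, which is the sense in which a single ω replaces both τ and ρ.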



5.3 Function ω and Metamodeling 

The function ω maps a higher level to a lower level. To show the power and the 
use of the function ω, we give below four examples using the function. Using ω we 
give the definition of the transitivity, symmetry and anti-symmetry properties of 
a relation. Such definitions use generic relation types at the higher and lower levels 
in specification graphs. So before giving the associated definitions, we present 
the notation of co-referenced types that are mapped from a higher level to a 
lower level. 

Concepts that represent co-referenced concepts are denoted with the same 
symbol in the referent field preceded by an asterisk and by a question mark. 
Figure 18 presents the meta level conceptual graph describing two co-referenced 
concepts that stand for an unidentified Concept Type®. 




Fig. 18. The meta conceptual graph for notation of coreferenced concepts. 



Two concepts that are linked by a coreference link are abstractions of the
same element. Therefore they have the same referent, which represents the
unidentified element. Applied to these concepts, the function ω returns the
unidentified element that is represented by the referent. For simplicity,
ω([Type : ?t1]) will be denoted ?t1 and ω([Type : *t1]) will be denoted *t1.



Transitivity. A relationship $\mathcal{R}$ is transitive if and only if:

$x \mathcal{R} y \wedge y \mathcal{R} z \Rightarrow x \mathcal{R} z$ (1)

Figure 19 presents the graph that defines a transitive relationship at the meta 
level. 

® In a coreference link the defining concept is denoted with an asterisk and the bound
concept is denoted with a question mark.

Fig. 19. Definition rule of a transitive relationship.

Symmetry. A relationship $\mathcal{R}$ is symmetrical if and only if:

$x \mathcal{R} y \Rightarrow y \mathcal{R} x$ (2)

Figure 20 presents the graph that defines a symmetrical relationship. 




Fig. 20. Definition rule of a symmetrical relationship. 



Anti-symmetry. A relationship $\mathcal{R}$ is anti-symmetrical if and only if:

$x \mathcal{R} y \wedge y \mathcal{R} x \Rightarrow x = y$ (3)

which is equivalent to

$\neg (x \mathcal{R} y \wedge y \mathcal{R} x \wedge x \neq y)$ (4)

Figure 21 presents the graph that defines an anti-symmetrical relationship®. 
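
Properties (1) to (4) can also be checked mechanically on a finite relation. The following sketch is only an illustration and is not part of the metamodel: it assumes that a relation is given extensionally as a Python set of pairs, and all function names are hypothetical.

```python
def is_transitive(rel):
    """(1): x R y and y R z imply x R z."""
    return all((x, z) in rel for x, y in rel for y2, z in rel if y == y2)


def is_symmetric(rel):
    """(2): x R y implies y R x."""
    return all((y, x) in rel for x, y in rel)


def is_antisymmetric(rel):
    """(3): x R y and y R x imply x = y."""
    return all(x == y for x, y in rel if (y, x) in rel)


def violates_form_4(rel):
    """True if some pair has x R y, y R x and x != y, i.e. (4) fails."""
    return any((y, x) in rel and x != y for x, y in rel)


# Example relation (an assumption for the demo): "less than or equal" on {0, 1, 2}.
leq = {(a, b) for a in range(3) for b in range(3) if a <= b}
```

On `leq`, `is_transitive` and `is_antisymmetric` hold while `is_symmetric` does not. For any finite relation, `is_antisymmetric(rel)` and `not violates_form_4(rel)` agree, which mirrors the equivalence of (3) and (4).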

6 Conclusion 

In this paper, we introduced an approach to clarify the semantics, notation, and
manipulation of concepts in the CG language. This approach uses metamodeling
constructs based on the CG language. This demonstrates that conceptual graphs
may be used in metamodeling activities. The function ω, which maps a higher
level to a lower level, allows the manipulation of concepts from different levels.

® Note: two different boxes represent two different concepts.

Fig. 21. Definition rule of an antisymmetrical relationship.



With a simple problem, the representation of concepts, we showed that 
through the formal definition of a meta level and of mapping functions from 
one level to the next, we can represent in a uniform way higher level and lower 
level and navigate between them. There is an obvious need for a complete the- 
ory of metamodeling in the CG formalism. We are currently developing such a 
theory. 
Abstract. The main conceptual contribution of this paper is to present
an approach to knowledge representation and reasoning based on labeled
graphs and labeled graph homomorphism. Strengths and weaknesses of this
graph-based approach are discussed. The main technical contributions are
as follows. Fundamental results about the kernel of this approach, the
so-called simple graphs model, are synthesized. It is then shown that the
basic deduction problem on simple graphs is essentially the same problem
as conjunctive query containment in databases and constraint satisfaction;
polynomial parsimonious transformations between these problems are
exhibited. Grounded on the simple graphs model, a knowledge representation
and reasoning model that deals with facts, production rules, transformation
rules, and constraints is presented, as an illustration of the graph-based
approach.



Introduction 

The main conceptual contribution of this paper is to present an approach to
knowledge representation and reasoning based on labeled graphs and labeled
graph homomorphism. The kernel of this approach is the so-called simple graphs
model, which has three essential characteristics: objects are labeled graphs;
reasoning is based on graph operations, and basically on labeled graph
homomorphism; the model is logically founded. This basic model has been
extended in several ways, keeping these fundamental properties.

The technical contributions of this paper are mainly the following.

- Fundamental results about the simple graphs model are synthesized and
revisited.

- It is shown that the basic deduction problem on simple graphs (comparing
two graphs by subsumption) is essentially the same problem as conjunctive
query containment in databases and constraint satisfaction, despite
their very different formulations; polynomial parsimonious transformations
between these problems are exhibited.

- Grounded on the simple graphs model, the constrained derivation model,
which deals with facts, production rules, transformation rules, and
constraints, is presented.

Some of the technical results provided in this paper are new, or have been
published in French only. The paper is organized as follows. The next section is
devoted to basic notions and results about the simple graphs model. Then,
correspondences between the simple graph deduction problem and, on the one hand,
the conjunctive query containment problem in databases and, on the other hand,
the constraint satisfaction problem, are exhibited. Section 3 introduces the
constrained derivation model. We finally discuss benefits and limitations of our
approach.

1 The Simple Graphs Model 

We assume the reader is familiar with the very basic notions about conceptual
graphs, namely simple graphs (conceptual graphs without negation), their logical
translation by $\Phi$, the projection operator, and co-reference links [Sow84].



1.1 Simple Graphs 

Basic ontological knowledge is encoded in a structure we call a support. We
consider here a primitive version of a support with 3 components, say
$S = (T_C, T_R, I)$. $T_C$ and $T_R$ are finite partially ordered sets and
respectively denote the concept and relation type sets. $T_C$ possesses a
greatest element, noted $\top$. Relation types may have any arity greater than or
equal to 1, and two comparable relation types must have the same arity. $I$ is a
set of individual markers. The following partial order is defined on the set of
markers $I \cup \{*\}$, where $*$ denotes the generic marker: $*$ is the greatest
element and elements of $I$ are pairwise noncomparable. $G = (R, C, E, l)$
denotes a simple graph, where $R$ and $C$ are the relation and concept node sets,
$E$ is the set of edges (with a total order on edges incident to a relation
node), and $l$ is the node labeling mapping. A simple graph is not necessarily
connected. Also note that it may be empty. Relation node labels are partially
ordered by the partial order on $T_R$. Concept node labels are partially ordered
by the product of the partial order on $T_C$ and the partial order on the set of
markers $I \cup \{*\}$ (a label $(t, m)$ is less than or equal to a label
$(t', m')$ if $t \leq t'$ and $m \leq m'$).
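
The product order on concept labels can be made concrete with a small sketch. It is an illustration under stated assumptions rather than the authors' implementation: the type order is assumed to be given as a dictionary mapping each type to its direct supertypes, and the names `type_leq`, `marker_leq`, `label_leq` and the toy hierarchy are hypothetical.

```python
GENERIC = "*"  # the generic marker, greatest element of the marker set


def type_leq(parents, t, u):
    """True if t <= u, where `parents` maps a type to its direct supertypes."""
    stack, seen = [t], set()
    while stack:
        cur = stack.pop()
        if cur == u:
            return True
        if cur not in seen:
            seen.add(cur)
            stack.extend(parents.get(cur, ()))
    return False


def marker_leq(m, n):
    """Individual markers are pairwise noncomparable; '*' is above all of them."""
    return m == n or n == GENERIC


def label_leq(parents, label, other):
    """Product order: (t, m) <= (t', m') iff t <= t' and m <= m'."""
    (t, m), (u, n) = label, other
    return type_leq(parents, t, u) and marker_leq(m, n)


# Hypothetical concept type hierarchy for the demo: Cat < Animal < Top.
concept_parents = {"Cat": {"Animal"}, "Animal": {"Top"}}
```

With this hierarchy, `label_leq(concept_parents, ("Cat", "felix"), ("Animal", "*"))` holds, while the converse comparison does not.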

For simplicity reasons and because they do not add expressive power to simple 
graphs, we first exclude co-reference links from the basic model. However, we will 
show at the end of this section that all results extend to simple graphs with co- 
reference links. 



1.2 The $\leq$ Relation

The specialization/generalization relation (denoted by $\leq$ or $\geq$), or
subsumption relation, is the fundamental notion for reasoning with simple graphs.
This relation can be defined in terms of a sequence of elementary graph
operations: given two simple graphs $G$ and $H$, $H \leq G$ ($G$ subsumes $H$) if
$H$ can be obtained from $G$ by a sequence of such operations. It can also be
computed by a global operation, the so-called projection, which essentially is a
graph homomorphism.

In the original presentation of conceptual graphs [Sow84], simple graphs are
connected graphs. Specialization operations (called canonical formation rules)
are defined and it is proven that, given two (connected) simple graphs $G$ and
$H$, $H \leq G$ implies the existence of a projection from $G$ to $H$. [CM92]
proves the converse. See also [MC93] or [Mug95] for a precise definition and a
study of the notion of a specialization derivation.

In the proposed standard for conceptual graphs [Sow99], simple graphs are
no longer necessarily connected graphs (and may include co-reference links, but
this aspect will be taken into account later). A new set of canonical formation
rules is proposed: two equivalence rules (copy and its inverse, simplify), two
specialization rules (restrict and join) and two generalization rules (unrestrict
and detach), which are the inverses of the specialization rules. These rules are
said to be the foundation for all logical operations on CGs.

Indeed, the set of equivalence and specialization rules (or the set of
equivalence and generalization rules) is logically sound, i.e. if $H$ is obtained
from $G$ using these rules (say $H \leq G$) then $\Phi(H)$ implies $\Phi(G)$,
taking into account $\Phi(S)$, the logical interpretation of partial orders on
types in the support $S$. But it is not complete, in the sense that it does not
allow all logical deductions to be computed (see Figure 1).




$\Phi(G)$ implies $\Phi(H)$, but there is no means, with the proposed canonical
formation rules, of deducing one graph from the other.



Fig. 1. H should subsume G 



Let us propose another set of rules, which ensures the desirable soundness
and completeness properties (this set was originally defined in [CM95];
here, some operation names have been changed in order to avoid confusion with
[Sow99]). There are four specialization operations and four inverse operations,
called generalization operations. In both cases, the first operation produces a
graph equivalent to the original graph (where "equivalent" can be understood
in the sense of the $\leq$ relation or in a logical sense). For all operations,
the resulting graph is disjoint from the operand graph(s).

Specialization operations. Let $G$ be a simple graph; a simple graph $H$ can
be obtained from $G$ by:

- Relation simplify: delete a duplicate relation node of $G$ (i.e. a relation
node $r$ such that there is another relation node with the same type and
exactly the same $i$-th neighbors, for all $i$).

- Restrict: decrease the label of a relation node (its type) or the label of a
concept node (its type and/or its marker).

- Join: merge two concept nodes of $G$ with the same label.

- Disjoint sum: let $K$ be a simple graph disjoint from $G$; $H$ is obtained by
juxtaposing $G$ and $K$.

Generalization operations. Let $G$ be a simple graph; a simple graph $H$ can
be obtained from $G$ by:

- Relation duplicate: duplicate a relation node of $G$ (i.e. add a relation
node with the same type and exactly the same $i$-th neighbors, for all $i$, as
an existing relation node).

- Unrestrict: increase the label of a relation or concept node of $G$.

- Detach: split a concept node $c$ into nodes $c_1$ and $c_2$, with the same
label as $c$, such that the union of the sets of edges incident to $c_1$ and
$c_2$ is the edge set of $c$ (one of the two sets may be empty).

- Subtract: delete some (possibly all) connected components of $G$.

$H$ is a specialization of $G$ ($H \leq G$) if $H$ can be derived from $G$ by a
sequence of specialization operations. Since disjoint sum is a binary operation,
the derivation of $H$ from $G$ may have other sources than $G$. $G$ is a
generalization of $H$ ($G \geq H$) if $G$ can be derived from $H$ by a sequence
of generalization operations. Note that if $G \geq H$, then $G$ can be obtained
from $H$ by considering $H$ only (in terms of [MC93], the derivation from $G$ to
$H$ is a path, and not a graph with several sources). It is straightforward to
prove that $H \leq G$ iff $G \geq H$.

Notice that both sets of specialization and generalization operations are minimal
(no rule can be simulated with the other rules) and correspond to very elementary
operations.



1.3 Projection 

Projection [Sow84] is the fundamental operation on simple graphs since it allows
the effective computation of the $\leq$ relation. This operation is essentially a
labeled graph homomorphism. Let us recall that a graph homomorphism (or morphism)
$\Pi$ from a graph $G = (X_G, E_G)$ to a graph $H = (X_H, E_H)$, where $X$ and $E$
respectively denote the node set and the edge set of a graph, is a mapping from
$X_G$ to $X_H$ which preserves node adjacency, i.e. if $xy$ is an edge of $E_G$,
then $\Pi(x)\Pi(y)$ is an edge of $E_H$. A labeled graph homomorphism is a graph
homomorphism which in addition preserves labels (on nodes, and edges if the graph
is labeled on both). A projection is a generalization of a labeled graph
morphism; a partial order on node labels is added and the following additional
condition has to be satisfied: for any node $x$, the label of $x$ is greater than
or equal to the label of $\Pi(x)$. When the partial order is trivial (all labels
are noncomparable), one comes back to a classical labeled graph homomorphism.
Note that this definition ensures that the node bipartition and the ordering on
edges incident to a relation node are preserved.

Theorem 1. Let $G$ and $H$ be two simple graphs; $H \leq G$ iff there is a
projection from $G$ to $H$.

Proof. The proofs of [Sow84] and [CM92] — which apply to connected simple 
graphs — are trivially adapted. 
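
Theorem 1 reduces the test $H \leq G$ to a search for a projection. A naive backtracking search can be sketched as follows. This is only an illustration, not the authors' algorithm: the graph encoding (a dictionary of labeled concept nodes plus a list of relation nodes with ordered neighbors), the function names and the toy order are all assumptions.

```python
def find_projection(G, H, concept_leq, relation_leq):
    """Search for a projection from G to H by backtracking.

    A graph is a dict with "concepts" (node -> label) and "relations"
    (list of (type, tuple of concept nodes in neighbor order)).
    Returns the concept-node part of a projection, or None.
    """
    g_nodes = list(G["concepts"])

    def relations_ok(mapping):
        # Each relation of G whose neighbors are all mapped needs an image in H:
        # a relation with a smaller-or-equal type and the mapped neighbors, in order.
        for rtype, args in G["relations"]:
            if all(a in mapping for a in args):
                image = tuple(mapping[a] for a in args)
                if not any(relation_leq(htype, rtype) and tuple(hargs) == image
                           for htype, hargs in H["relations"]):
                    return False
        return True

    def extend(i, mapping):
        if i == len(g_nodes):
            return dict(mapping)
        x = g_nodes[i]
        for y, y_label in H["concepts"].items():
            # The label of x must be greater than or equal to the label of its image.
            if concept_leq(y_label, G["concepts"][x]):
                mapping[x] = y
                if relations_ok(mapping):
                    found = extend(i + 1, mapping)
                    if found is not None:
                        return found
                del mapping[x]
        return None

    return extend(0, {})


def demo_concept_leq(label, other):
    """Toy order (assumption): Cat <= Animal on types, any marker <= '*'."""
    (t, m), (u, n) = label, other
    return (t == u or (t, u) == ("Cat", "Animal")) and (m == n or n == "*")


def same_type(r, s):
    """Trivial order on relation types, as in a simplified support."""
    return r == s


G = {"concepts": {"x": ("Animal", "*"), "y": ("Mat", "*")},
     "relations": [("on", ("x", "y"))]}
H = {"concepts": {"a": ("Cat", "felix"), "b": ("Mat", "*")},
     "relations": [("on", ("a", "b"))]}
```

Here `find_projection(G, H, demo_concept_leq, same_type)` finds a projection, so $H \leq G$ in this toy setting, whereas no projection exists from `H` to `G`. The search is exponential in the worst case, which is consistent with the hardness of the homomorphism problem discussed later in the paper.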

Finally, let us recall some results about the structure of the specialization 
relation: 

1. $\leq$ is a preorder relation (and not an order) (e.g. [Jac88]).

2. Let us say that two simple graphs $G$ and $H$ are equivalent if they subsume
each other; an irredundant graph is a graph that is not equivalent to one of its
subgraphs. Then any equivalence class for $\leq$ contains a unique irredundant
graph (up to isomorphism); moreover this graph is the smallest graph of the
equivalence class.

3. The restriction of $\leq$ to irredundant graphs is a lattice.

Proof. See [CM92] for property 2 and [CM95] for property 3. 

1.4 Soundness and Completeness of Graph Operations 

The semantics $\Phi$ assigns a first-order formula $\Phi(G)$ to each graph $G$
[Sow84]. $\Phi(G)$ is a positive, conjunctive and existentially closed formula.
A support $S$ is assigned a set of formulas, $\Phi(S)$, which corresponds to the
interpretation of the partial orderings of $T_R$ and $T_C$. For all types $t_1$
and $t_2$ such that $t_1$ covers $t_2$, one has the formula
$\forall x_1 \ldots x_p\, (t_2(x_1, \ldots, x_p) \rightarrow t_1(x_1, \ldots, x_p))$,
where $p = 1$ for concept types, and $p$ is otherwise the arity of the relation.
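
As an illustration that is not in the original text, consider the graph [Cat] -> (on) -> [Mat] with two generic concept nodes, and assume a support in which the type Animal covers the type Cat (this hierarchy is hypothetical). Under the standard translation of [Sow84], where each generic concept node yields an existentially quantified variable and each type yields a predicate, one would expect:

```latex
\Phi(G) = \exists x\, \exists y\, (Cat(x) \wedge Mat(y) \wedge on(x, y)),
\qquad
\forall x_1\, (Cat(x_1) \rightarrow Animal(x_1)) \in \Phi(S)
```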

Theorem 2. (Soundness) Given two simple graphs $G$ and $H$ defined on $S$, if
$G$ projects to $H$ then $\Phi(S), \Phi(H) \models \Phi(G)$.

Proof. Similar to the proof of [Sow84] for connected graphs. 

Theorem 3. (Completeness) If $H$ is in normal form (i.e. there are no two nodes
with the same individual marker) and $\Phi(S), \Phi(H) \models \Phi(G)$ then $G$
projects to $H$.

Proof. See [CM95]. The proof is similar to that of [CM92] for connected graphs. 
Notice that if H is not in normal form the result is false: see Figure 2. 

The normal form of a graph is easily computable (merge all individual nodes 
with same marker) . However, depending on constraints on types associated with 
individual markers, it may be impossible to compute a type corresponding to 
the merging of two nodes. In the simplest case, when individual markers always 
appear with the same type in a concept node, any graph possesses a normal 
form, and the logical formula assigned to the normal form of a graph is trivially 
equivalent to that of the graph. 
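
In the simplest case described above, computing the normal form amounts to merging individual nodes that carry the same marker. The sketch below illustrates this under assumptions that are not in the original text: the same hypothetical dictionary encoding of graphs is used, the first node seen for a marker is kept as the representative, and a marker appearing with two different types is rejected rather than resolved.

```python
def normal_form(graph):
    """Merge all individual concept nodes that share the same marker.

    `graph` has "concepts" (node -> (type, marker)) and "relations"
    (list of (type, tuple of concept nodes)). '*' is the generic marker.
    """
    representative = {}  # individual marker -> node kept for that marker
    rename = {}          # original node -> node of the normal form
    concepts = {}
    for node, (ctype, marker) in graph["concepts"].items():
        if marker != "*" and marker in representative:
            kept = representative[marker]
            if concepts[kept][0] != ctype:
                # Outside the "simplest case": no merged type is computed here.
                raise ValueError(f"marker {marker!r} appears with two types")
            rename[node] = kept
        else:
            if marker != "*":
                representative[marker] = node
            rename[node] = node
            concepts[node] = (ctype, marker)
    relations = [(rtype, tuple(rename[a] for a in args))
                 for rtype, args in graph["relations"]]
    return {"concepts": concepts, "relations": relations}
```

Duplicate relation nodes that may appear after merging are left in place; removing them would be a separate relation simplify step.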

Two alternative logical semantics have been proposed in [PMC98] and
[CMS98] [Sim98]. [PMC98] defines a new logic, in which deduction is equivalent
to projection. [CMS98] [Sim98] proposes a variant of $\Phi$, called $\Psi$, for
which projection is sound and complete w.r.t. FOL deduction without any
restriction.

$\Phi(G) = \Phi(H) = \exists x\; t(x) \wedge t(a) \wedge t(a) \wedge r(x, a) \wedge s(x, a) \wedge u(x, a)$

$G$ and $H$ have the same logical translation but they are noncomparable by projection.

Fig. 2. The need for normality 



These semantics rely on the same intuitive idea: if several nodes represent the 
same entity, they a priori correspond to several different viewpoints on this 
entity; thus, the graph obtained by merging these nodes is not considered as 
equivalent to the original one. 

1.5 Co-reference Links 

Co-reference links do not add expressive power to the simple graph model. Indeed,
there are two ways of translating them in FOL. If co-reference links are allowed
between generic concept nodes only, the same variable can be associated with
co-referent nodes. If co-reference links can link generic and individual nodes,
the equality operator is needed to express that terms associated with these nodes
are equal. In both cases, the logical formula assigned to a graph with
co-reference links is equivalent to that of the graph obtained by merging all
co-referent nodes (provided that these nodes have the same type)^.

More generally, let us say that two nodes are co-identical if they refer to the
same entity, i.e. they are generic nodes related by a co-reference link or
individual nodes with the same individual marker. Formally, co-identity is an
equivalence relation on the concept node set of a graph, denoted by co-ident. The
normal form of a simple graph with co-reference links is obtained by merging
co-identical nodes. In other words, a graph is in normal form if the co-identity
relation on its concept nodes is the identity. The logical formula associated
with a graph is equivalent to that associated with its normal form.

Let us extend the previous canonical formation rules in order to deal with
co-reference links. The set of specialization rules (Relation simplify, Restrict,
Join, Disjoint sum) is extended with a new rule (Co-reference addition) and,
since a part of the Join operation becomes redundant, Join is replaced with a
rule merging co-identical nodes only (Co-identical join); from a logical
viewpoint, it becomes an equivalence rule.

^ Do nodes referring to the same entity necessarily have the same type? Since we
do not want to tackle this question here, we consider simple ways of avoiding it:
(1) an individual marker always appears with the same type; (2) coreferent nodes
have the same type; (3) coreferent nodes are generic nodes (note however we could
allow co-reference between individual and generic nodes with very slight changes in
definitions).

- Co-reference addition: add a co-reference link between two generic concept
nodes with the same type.

- Co-identical join: merge two co-identical nodes of the same graph, i.e.
merge two co-referent nodes or two nodes with the same individual marker.

The set of generalization rules is modified in the same way. A new rule is 
added (Co-reference deletion) and Detach is replaced by an equivalence rule 
(Co-identical split), splitting a node into two co-identical nodes. 

— Co-reference deletion: delete a co-reference link. 

— Co-identical split: split a node into two co-identical nodes with same label, 
i.e. split an individual node into two nodes with the same label, or split a 
generic node into two nodes with the same label related by a co-reference 
link. The union of the sets of edges incident to the two nodes is the edge set 
of the original node — one of the two sets may be empty. 

Now, projection has to respect co-identity, i.e. two coreferent nodes have co- 
identical images (either the same image, or distinct images which are coreferent 
nodes or nodes with the same individual marker). Note that previous results 
about simple graphs (theorems 1, 2 and 3) extend to simple graphs with coref- 
erence links. 



1.6 Note about Projection and Logical Completeness 

Our approach gives greater importance to projection than to the set of canonical
formation rules; the price to pay is that graphs have to be put in normal form
to ensure logical completeness (w.r.t. $\Phi$; let us recall that in [Sim98] an
alternative semantics $\Psi$ is proposed for which the completeness results hold
without restriction). Suppose one rather aims at providing complete formation
rules with respect to $\Phi$, without any restriction on the form of the graphs.
Such a result can be achieved with the rules we propose in the following manner.

Property 1. Consider the set of specialization rules (co-reference addition,
relation simplify, restrict, co-identical join, disjoint sum) extended with
co-identical split. Then, given two simple graphs (possibly with coreference
links) $G$ and $H$, $H$ can be derived from $G$ if and only if
$\Phi(S), \Phi(H) \models \Phi(G)$.

A similar property is obtained with the set of generalization rules extended with
co-identical join. These properties can be explained as follows. Let us consider
the structuring of the simple graph space by the projection operation (a graph
$G$ is said to be more general than a graph $H$ if $G$ projects onto $H$). Let us
consider two extremal transformations of a graph: the normal form (no
co-identical join can be done) and the anti-normal form (see [GW95]: the graph is
split into star graphs, composed of one relation and its neighbors, all distinct,
related with co-reference links; in other words, no co-identical split can be
done). The following properties are immediate:

- The anti-normal form and the normal form of a graph are unique (up to
isomorphism).

- If two graphs have the same normal form, they also have the same anti-normal
form; to one normal form corresponds one anti-normal form, and conversely.

- $G$ projects onto its normal form and the anti-normal form of $G$ projects
onto $G$.

- All graphs with the same anti-normal form (or normal form) are logically
equivalent by $\Phi$.

If we possess both rules “co-identical join” and “co-identical split”, we can 
search the whole space of graphs bounded by an anti-normal graph and the corre- 
sponding normal form. Note that the projection completeness theorem (theorem 
3) could also be rewritten by replacing the normal form condition on the target 
graph H with an anti-normal form condition on the source graph G: 

Theorem 4. (Completeness) If $G$ is in anti-normal form (i.e. no co-identical
split can be done on $G$) and $\Phi(S), \Phi(H) \models \Phi(G)$ then $G$
projects to $H$.

To conclude this note, either one favours projection, and logical complete- 
ness of graphs operations (projection or an equivalent set of formation rules) 
is obtained with (minor) restrictions on the form of graphs; or one favours log- 
ical completeness of the set of formation rules, and the correspondence with 
projection is lost. 

1.7 Note about the Proposed Formation Rules in [Sow99] 

The previous rules present two advantages w.r.t. the rules proposed in [Sow99]:
logical completeness and simplicity. In particular, the equivalence rules of
[Sow99], copy and simplify, could be decomposed into more elementary equivalence
operations, respectively relation duplicate and co-identical split, and
co-identical join and relation simplify (see Figure 3).

Let us consider for instance the copy rule definition (simplify is the inverse
of copy): “the copy rule makes a copy of any subgraph v of u and adds it to u to
form w. If c is any concept of v that has been copied from a concept d in u, then
c must be a member of exactly the same coreference sets as d. Some conceptual
relations of v may be linked to concepts of u that are not in v; the copies of
those relations must be linked to exactly the same concepts of u”. Copy can be
simulated by a sequence of relation duplicate operations (for all copied
relations) followed by a sequence of co-identical splits (for all copied
concepts).

Let us now come back to simple graphs without co-reference links, and to the
basic deduction problem: given two simple graphs $G$ and $H$, is there a
projection from $G$ to $H$? The next section shows that this problem is
essentially the same as two other basic problems in databases and artificial
intelligence.

2 Projection, Conjunctive-Query Containment 
and Constraint Satisfaction 

Fig. 3. Proposed equivalence rules in [Sow99]

Conjunctive queries constitute a broad class of frequently used queries in
databases because their expressive power is equivalent to that of the
Select-Join-Project queries in relational databases [AHV95]. Conjunctive query
containment is a fundamental problem in database query evaluation and
optimization. Constraint satisfaction problems are a class of combinatorial
problems widely investigated in artificial intelligence. What do these problems
have in common, and what do they have in common with projection between simple
conceptual graphs? This section shows that the three problems are essentially
the same problem, even if they are formulated in very different ways.

The key notion explaining the correspondences between these problems is that
of a labeled graph homomorphism. Let us recall that a labeled graph homomorphism
from a labeled graph $G$ to a labeled graph $H$ is a mapping from the nodes of
$G$ to the nodes of $H$ that preserves edges and labels. We will see that the
three problems can be formulated in terms of a labeled graph homomorphism
problem. Such a result is in fact not surprising, because homomorphism is the
fundamental notion relating relational structures (which are the most general
algebraic structures). The fundamental algebraic problem is the following: given
two finite relational structures $A$ and $B$, is there a homomorphism from $A$ to
$B$? This problem can be immediately recast as a labeled graph homomorphism
problem since a finite relational structure can be naturally formulated as a
labeled graph of the same size (which in fact can be seen as a simple conceptual
graph).

Let us call a simplified support a support $S = (T_C, T_R, I)$ where $T_C$ and
$T_R$ are provided with the trivial partial order, i.e. all elements of $T_C$ and
$T_R$ are noncomparable. For establishing equivalences between problems we will
consider simplified supports; at the end of this section, we will reconsider the
introduction of partial orders on types. The Projection problem is given by a
simplified support $S$ and two simple graphs in normal form $G$ and $H$ relative
to $S$, and asks if there is a projection from $G$ to $H$. As already noticed,
since $S$ is a simplified support, the question comes down to asking whether
there is a labeled graph homomorphism from $G$ to $H$.

This section first presents polynomial transformations between Projection
and Conjunctive query containment, then polynomial transformations between
Projection and Constraint satisfaction. All mappings are parsimonious
transformations (they preserve the set of solutions). Thus, algorithms for
building a solution or for enumerating all solutions can be directly transferred
from one domain to the other.

2.1 Projection and Conjunctive-Query Containment 

Let us first recall some basic database definitions about conjunctive queries.
We follow here the notations of [AHV95]. Let $R$ denote a set of $n$-adic relation
names ($n \geq 1$) and let dom denote a countably infinite set of constants. A
(positive and non-recursive) conjunctive query $q$ is usually written as a rule,
$q = ans(u) \leftarrow r_1(u_1), \ldots, r_n(u_n)$, $n \geq 1$, where
$r_1, \ldots, r_n$ belong to $R$, $ans$ is not in $R$ ($ans$ stands for "answer"),
$u$ and $u_1, \ldots, u_n$ are tuples of terms (variables or constants of dom),
and each variable of $u$ occurs at least once in $u_1, \ldots, u_n$. Notice
that $u$ may be empty. A database instance $a$ on $R$ maps each $p$-ary relation
of $R$ to a finite subset of dom$^p$. Given a query
$q = ans(u) \leftarrow r_1(u_1), \ldots, r_n(u_n)$ and an instance $a$ on $R$,
$q(a)$ denotes the set of answers to $q$ in $a$; $q(a)$ is the set of tuples
$\mu(u)$ where $\mu$ is a substitution of the variables of $q$ by dom constants
such that for any $j$ in $\{1, \ldots, n\}$, $\mu(u_j) \in a(r_j)$. When the arity
of $ans$ is 0, $q(a)$ is the set $\{()\}$ if there is a substitution $\mu$ such
that for any $j$ in $\{1, \ldots, n\}$, $\mu(u_j) \in a(r_j)$, and otherwise
$\emptyset$. A query $q$ is said to contain a query $q'$ ($q' \subseteq q$) if,
for any instance $a$ on $R$, $q'(a) \subseteq q(a)$. The Query containment
problem is the associated problem: given two queries $q$ and $q'$, does $q$
contain $q'$? It is well-known in databases that the query containment problem
can be reformulated as a homomorphism problem, where a homomorphism is
defined as follows. A homomorphism from
$q = ans(u) \leftarrow r_1(u_1), \ldots, r_n(u_n)$ to
$q' = ans'(u') \leftarrow r'_1(u'_1), \ldots, r'_{n'}(u'_{n'})$ is a substitution
$\theta$ of the variables of $q$ by terms (variables or constants of dom) such
that $\theta(u) = u'$ and for any $j$ in $\{1, \ldots, n\}$, there is $i$ in
$\{1, \ldots, n'\}$ such that $\theta(r_j(u_j)) = r'_i(u'_i)$. The homomorphism
theorem [AHV95] proves that, given two queries $q$ and $q'$, $q$ contains $q'$
iff there is a homomorphism from $q$ to $q'$.
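
The homomorphism condition just recalled can be tested by a direct backtracking search over the atoms of $q$. The sketch below is an illustration only; it assumes an encoding that is not in the original text: a query is a pair (head tuple, list of (relation name, argument tuple)), variables are strings starting with "?", and every other term is a constant. The function names are hypothetical.

```python
def is_var(term):
    """Assumed convention: variables are strings starting with '?'."""
    return isinstance(term, str) and term.startswith("?")


def unify(args, target, theta):
    """Extend the substitution theta so that theta(args) == target, or return None."""
    if len(args) != len(target):
        return None
    theta = dict(theta)
    for a, b in zip(args, target):
        if is_var(a):
            if theta.setdefault(a, b) != b:
                return None
        elif a != b:  # a constant of q must be mapped to itself
            return None
    return theta


def query_homomorphism(q, q2):
    """Return a homomorphism (substitution) from q to q2, or None."""
    (head, body), (head2, body2) = q, q2
    start = unify(head, head2, {})  # theta(u) = u'
    if start is None:
        return None

    def search(j, theta):
        if j == len(body):
            return theta
        name, args = body[j]
        for name2, args2 in body2:  # look for an atom of q2 that is theta(r_j(u_j))
            if name == name2:
                extended = unify(args, args2, theta)
                if extended is not None:
                    found = search(j + 1, extended)
                    if found is not None:
                        return found
        return None

    return search(0, start)


# Demo queries (assumptions): q = ans(x) <- r(x, y); q2 = ans(u) <- r(u, v), s(v).
q = (("?x",), [("r", ("?x", "?y"))])
q2 = (("?u",), [("r", ("?u", "?v")), ("s", ("?v",))])
```

By the homomorphism theorem, a non-`None` result for `query_homomorphism(q, q2)` means that `q` contains `q2`. In the demo, `q2` is the more constrained query, so a homomorphism exists from `q` to `q2` but not in the other direction.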



[Figure 4 shows the graph $G_q$ assigned to the query below.]

$q = ans(x, z) \leftarrow r(x, y, z), s(z, v), t(a)$



Fig. 4. Transformation from a query to a graph 



Transformation from Query Containment to Projection (Q2P). Let
$S = (T_C, T_R, I)$ be a simplified support built as follows. $T_C$ is restricted
to a single type. $T_R$ is equal to $R \cup Ans$, where
$Ans = \{ans_k \mid k \geq 1\}$ and $k$ stands for the arity of $ans$. $I$ is
equal to dom. Elements of $T_R$ are noncomparable. Then a simple graph (in normal
form) $G_q = (R_q, C_q, E_q, l_q)$ can be assigned to $q$ as follows (see
Figure 4 for an example). $R_q$ is a set of $(n+1)$ nodes labeled by
$r_1, \ldots, r_n, ans_k$. $C_q$ is in bijection with the set of terms occurring
in $u, u_1, \ldots, u_n$. The node of $C_q$ assigned to a term $e$ is generic if
$e$ is a variable and otherwise individual with the referent $e$. And if $e$ is
the $i$-th argument of $r_j$ (or $ans$), then the concept node assigned to $e$ is
the $i$-th neighbor of the relation node assigned to $r_j$ (or $ans$). Then, one
can immediately check that, given two queries $q$ and $q'$, every homomorphism
from $q$ to $q'$ is a projection from $G_q$ to $G_{q'}$, and conversely.
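
The Q2P construction can be sketched in a few lines. This is an illustration under assumptions that are not in the original text: variables are strings starting with "?", each term is used directly as the identifier of its concept node, the single concept type is written "T", and the function name is hypothetical.

```python
def is_var(term):
    """Assumed convention: variables are strings starting with '?'."""
    return isinstance(term, str) and term.startswith("?")


def query_to_graph(head, body, top="T"):
    """Build the simple graph assigned to the query ans(head) <- body.

    `body` is a list of (relation name, tuple of terms). The result has
    "concepts" (node -> (type, marker)) and "relations"
    (list of (type, tuple of concept nodes in neighbor order)).
    """
    concepts = {}

    def node(term):
        # One concept node per term: generic for a variable,
        # individual with the term as referent for a constant.
        concepts.setdefault(term, (top, "*" if is_var(term) else term))
        return term

    relations = [(name, tuple(node(t) for t in args)) for name, args in body]
    # The answer atom becomes a relation node of type ans_k, k = arity of the head.
    # An empty head yields 'ans0' here; the text only defines ans_k for k >= 1.
    relations.append((f"ans{len(head)}", tuple(node(t) for t in head)))
    return {"concepts": concepts, "relations": relations}


# The query of Figure 4: q = ans(x, z) <- r(x, y, z), s(z, v), t(a).
g_q = query_to_graph(("?x", "?z"),
                     [("r", ("?x", "?y", "?z")), ("s", ("?z", "?v")), ("t", ("a",))])
```

For the query of Figure 4, the resulting graph has five concept nodes (four generic, one individual with referent `a`) and four relation nodes, the last one being of type `ans2`.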

Moreover, it can be proven that the query homomorphism theorem ([AHV95], 
6.2.3) is equivalent to the soundness and completeness theorem for projection 
between simple graphs (in normal form) [CM92], the minimization theorem 
([AHV95], 6.2.6) is equivalent to the theorem about irredundant simple graphs 
[CM92], and the complexity theorem for query decision problems ([AHV95], 
6.2.10) is equivalent to the complexity theorem for simple graphs decision prob- 
lems [CM92]. See [SCM98] for more detail about these equivalencies. 



Transformation from Projection to Query Containment (P2Q). A simplified
support $S = (T_C, T_R, I)$ naturally induces a database; the database
relation names are given by the union of $T_C$ and $T_R$ (the arity of an element
of $T_C$ is 1), and dom is a countably infinite set which contains the individual
marker set $I$. A query $q_G$ is built from a normal simple graph $G$ in a manner
similar to $\Phi(G)$; more specifically,
$q_G = ans() \leftarrow r_1(u_1), \ldots, r_n(u_n)$, where
$r_1(u_1), \ldots, r_n(u_n)$ are the atoms of $\Phi(G)$. Given two normal simple
graphs $G$ and $H$, it can be easily checked that any homomorphism from $q_G$ to
$q_H$ is a projection from $G$ onto $H$, and conversely.
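
The reverse construction P2Q is equally direct. The sketch below is again an illustration under assumptions that are not in the original text: it uses the same hypothetical dictionary encoding of graphs, names the variable of a generic node after the node identifier with a "?" prefix, and assumes the graph is already in normal form.

```python
def graph_to_query(graph):
    """Build the query q_G = ans() <- atoms of Phi(G) from a normal simple graph.

    `graph` has "concepts" (node -> (type, marker)) and "relations"
    (list of (type, tuple of concept nodes)). Returns (head, body) with an
    empty head, since ans has arity 0 here.
    """
    def term(node):
        _, marker = graph["concepts"][node]
        # Generic node -> a variable of its own; individual node -> its marker.
        return f"?{node}" if marker == "*" else marker

    # One unary atom per concept node (concept types have arity 1) ...
    body = [(ctype, (term(node),)) for node, (ctype, _) in graph["concepts"].items()]
    # ... and one atom per relation node, with arguments in neighbor order.
    body += [(rtype, tuple(term(a) for a in args))
             for rtype, args in graph["relations"]]
    return ((), body)


# Demo graph (assumption): a generic cat on the individual mat m1.
G = {"concepts": {"c": ("Cat", "*"), "m": ("Mat", "m1")},
     "relations": [("on", ("c", "m"))]}
```

For the demo graph, the body of the resulting query contains the atoms `Cat(?c)`, `Mat(m1)` and `on(?c, m1)`.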

Theorem 5. The problems Conjunctive query containment and Projection are 
polynomially equivalent; furthermore by both transformations Q2P and P2Q, ev- 
ery solution to one problem is a solution to the other one. 

2.2 Projection and Constraint Satisfaction 

The input of a constraint satisfaction problem (CSP) is a set of variables, a set of 
possible values for the variables and a set of constraints between the variables. 
The question is to determine whether there is an assignment of values to the 
variables that satisfies the constraints. More specifically, a constraint network 
$P = (X, C, D, R)$ consists of (see Figure 5):

— a set of variables, $X = \{x_1, \ldots, x_n\}$;

— a set of constraints, $C = \{C_1, \ldots, C_p\} \subseteq \mathcal{P}(X)$.
When constraints are binary, i.e. involve two variables, the pair $(X, C)$ is
generally seen as an undirected graph, whose nodes are the variables and whose
edges are the constraints. In the general case, $(X, C)$ is considered as a
hypergraph;

— a set of variable domains, $D = D_1 \cup \ldots \cup D_n$, where $D_i$ is the domain of
variable $x_i$;

— a set of constraint definitions, $R = \{R_1, \ldots, R_p\}$; for every constraint $C_i$, let
$(x_{i_1}, \ldots, x_{i_k})$ be any ordering of the variables of $C_i$; then
$R_i \subseteq D_{i_1} \times \ldots \times D_{i_k}$.

A solution to P is an assignment of values to the n variables such that all
constraints are satisfied; more formally, it is a mapping

$I: X \rightarrow D$, $x_i \mapsto a \in D_i$,

such that, for every constraint $C_j = (x_{j_1}, \ldots, x_{j_k})$,
$(I(x_{j_1}), \ldots, I(x_{j_k})) \in R_j$.

A constraint satisfaction problem (CSP) is the associated problem: given a 
constraint network P, is there a solution to P? 
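
The definition above can be illustrated by a minimal Python sketch. It is not
taken from the paper: the function name `solutions`, the encoding of a constraint
as a (scope, relation) pair and the toy network are assumptions chosen for
illustration, and the search is plain enumeration rather than an efficient solver.

```python
from itertools import product

def solutions(variables, domains, constraints):
    """Enumerate the solutions of a constraint network P = (X, C, D, R).

    variables:   list of variable names (X)
    domains:     dict mapping each variable x_i to its domain D_i
    constraints: list of (scope, relation) pairs; scope is an ordering of the
                 variables of C_j, relation is the set of allowed tuples R_j
    """
    for values in product(*(sorted(domains[x]) for x in variables)):
        assignment = dict(zip(variables, values))
        # A solution must send the scope of every constraint to a tuple of R_j.
        if all(tuple(assignment[x] for x in scope) in relation
               for scope, relation in constraints):
            yield assignment

# Hypothetical toy network: x < y and y != z over the values {1, 2}.
variables = ["x", "y", "z"]
domains = {v: {1, 2} for v in variables}
constraints = [
    (("x", "y"), {(1, 2)}),
    (("y", "z"), {(1, 2), (2, 1)}),
]
found = list(solutions(variables, domains, constraints))
```

The CSP question "is there a solution to P?" then amounts to asking whether
`found` is non-empty.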





G and H are obtained from the constraint satisfaction network of Figure 5 by the
transformation C2P.

Fig. 6. Transformation from CSP to Projection



Transformation from Constraint Satisfaction to Projection (C2P). A
CSP can be recast as a Projection problem by the following transformation, say
C2P (see Figure 6). Consider a constraint network $P = (X, C, D, R)$. Let us
build a simplified support $S = (T_C, T_R, I)$ from P as follows: $T_C$ is restricted
to one element, say $\top$, $T_R$ is equal to C, each relation type having the same
arity as the corresponding constraint, and $I$ is equal to D. G and H are then
defined as follows. The concept node set of G is equal to X. Each concept node is
labeled by $(\top, *)$. There is one relation node for each constraint, labeled by the
corresponding relation type. The i-th neighbor of a relation node is the concept
node corresponding to the i-th variable of the constraint. The concept node set
of H is D. Each concept node is labeled by $(\top, a)$, where a is the associated
value of D. There is one relation node for each tuple of a constraint definition;
this relation node is labeled by the name of the constraint, and its i-th neighbor
is the concept node corresponding to the i-th value. Notice that both graphs G
and H are in normal form. It can be easily checked that any solution to the CSP
is a projection from G to H, and reciprocally.
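
The construction of G and H can be sketched as follows. This is an illustrative
reading of C2P, not the authors' implementation: the graph encoding (a dictionary
of concept labels plus a list of relation nodes), the names `c2p` and
`projections`, and the toy network are assumptions. The sketch also assumes that
every variable occurs in at least one constraint, so that its domain is enforced
by the constraint tuples.

```python
from itertools import product

TOP = "T"  # stands for the single concept type of the simplified support

def c2p(variables, domains, constraints):
    """Build (G, H) from a network; constraints are (name, scope, relation)."""
    values = sorted(set().union(*domains.values()))
    g = ({x: (TOP, "*") for x in variables},             # generic concept nodes
         [(name, tuple(scope)) for name, scope, _ in constraints])
    h = ({a: (TOP, a) for a in values},                  # one node per value
         [(name, tuple(t)) for name, _, rel in constraints for t in sorted(rel)])
    return g, h

def projections(g, h):
    """Enumerate projections from G to H by brute force (flat support)."""
    (g_nodes, g_rels), (h_nodes, h_rels) = g, h
    h_rels = set(h_rels)
    for images in product(list(h_nodes), repeat=len(g_nodes)):
        pi = dict(zip(g_nodes, images))
        # A generic marker maps to anything, an individual marker to itself.
        labels_ok = all(h_nodes[pi[c]][0] == t and m in ("*", h_nodes[pi[c]][1])
                        for c, (t, m) in g_nodes.items())
        if labels_ok and all((name, tuple(pi[c] for c in nbrs)) in h_rels
                             for name, nbrs in g_rels):
            yield pi

# Hypothetical toy network: x < y and y != z over the values {1, 2}.
network = (
    ["x", "y", "z"],
    {"x": {1, 2}, "y": {1, 2}, "z": {1, 2}},
    [("lt", ("x", "y"), {(1, 2)}), ("neq", ("y", "z"), {(1, 2), (2, 1)})],
)
G, H = c2p(*network)
found = list(projections(G, H))
```

Each projection in `found` reads directly as an assignment of values to the
variables of the network.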



Transformation from Projection to Constraint Satisfaction (P2C). In
turn, a Projection problem can be recast as a CSP by the following transformation,
say P2C. There is one variable for each concept node of G, and one
constraint for each relation node of G. The set D of domain values is built from the
concept node labels of H: there is one value for each distinct label in H. The
domain $D_i$ of a variable $x_i$ is the subset of D restricted to the a priori possible
images for the concept node $c_i$ by a projection from G to H, i.e.
$D_i = \{l = (t, m) \mid label(c_i) \geq l\}$.
The definition of a constraint $C_i$ coming from a relation node $r_i$ is the set of
tuples given by the relation nodes of H with the same label as $r_i$, i.e. if $C_i$ is a
constraint over variables $(x_{i_1}, \ldots, x_{i_k})$, then its definition is
$R_i = \{(l_{i_1}, \ldots, l_{i_k}) \mid l_{i_j} = label(c_{i_j})$, where $c_{i_1}, \ldots, c_{i_k}$ are
neighbors, in this order, of a relation node r in H with $label(r) = label(r_i)\}$.
It can be easily checked that any projection from
G to H is a solution to the CSP, and reciprocally.
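
A small sketch of P2C under stated assumptions: concept types are pairwise
incomparable, the concept nodes of H carry pairwise distinct labels (so that a
label identifies a node), and graphs are encoded as a dictionary of concept
labels plus a list of relation nodes. The names `p2c`, `leq` and `solve` and the
example graphs are hypothetical.

```python
from itertools import product

def leq(label, other):
    """label <= other: same type, and other is generic or has the same marker."""
    return label[0] == other[0] and other[1] in ("*", label[1])

def p2c(g, h):
    """Recast 'is there a projection from G to H?' as a constraint network."""
    (g_nodes, g_rels), (h_nodes, h_rels) = g, h
    values = set(h_nodes.values())          # one value per distinct label of H
    domains = {c: {l for l in values if leq(l, g_nodes[c])} for c in g_nodes}
    constraints = [
        (nbrs, {tuple(h_nodes[d] for d in h_nbrs)
                for h_type, h_nbrs in h_rels if h_type == r_type})
        for r_type, nbrs in g_rels
    ]
    return list(g_nodes), domains, constraints

def solve(variables, domains, constraints):
    """Brute-force enumeration of the solutions of the network."""
    for values in product(*(sorted(domains[x]) for x in variables)):
        a = dict(zip(variables, values))
        if all(tuple(a[x] for x in scope) in rel for scope, rel in constraints):
            yield a

# G: some person is in some office.  H: Ann is in office O1; Bob is a person.
G = ({"x": ("Person", "*"), "y": ("Office", "*")}, [("in", ("x", "y"))])
H = ({"a": ("Person", "Ann"), "b": ("Person", "Bob"), "o": ("Office", "O1")},
     [("in", ("a", "o"))])
found = list(solve(*p2c(G, H)))
```

Under these assumptions the single solution in `found` corresponds to the single
projection sending x to the node of Ann and y to the node of O1.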

Theorem 6. The problems CSP and Projection are polynomially equivalent; 
furthermore by both transformations C2P and P2C, every solution to one prob- 
lem is a solution to the other one. 

Note about general supports. What if we consider simple conceptual graphs 
on general supports instead of simplified supports? The translations C2P and 
Q2P from CSP and Conjunctive query containment to Projection still hold, 
since a simplified support is a support. The translation P2C can be adapted in a 
straightforward way: the tuples of the definition of a constraint $C_i$ coming from
a relation node of G are given by the relation nodes of H with a label less than or
equal to the label of that node (instead of equal to it). For P2Q, G and H on a
support S can be translated into G' and H' on a simplified support S' such that every
projection from G to H is a projection from G' to H', and reciprocally. See for
instance [Bag99]; the size of the new input (S', G', H') is linear in the size of
the old input (S, G, H).

Relationships with published works. The translation Q2P has been pub- 
lished in [CMS98]. A first version of the correspondences C2P and P2C can be 
found in [CM95] (unpublished) or in [MC96] (in French). In these previous
versions, correspondences are established between labeled graph morphism and binary
CSP, but they rely on the same ideas. Independently, in [KV98] it is shown
that conjunctive query containment and CSP are essentially the same problem 
because they can be recast as a relational structure homomorphism. 

3 The Constrained Derivation Model 

It would be out of the scope of this paper to present in detail this extension to 
the simple graph model (see [BGM99]). The aim of the following presentation is 
rather to illustrate our approach. The constrained derivation model is composed 
of three kinds of objects: simple graphs, rules and constraints. Simple graphs are 
the basic constructs from which rules and constraints are defined. Projection is 
the basic operation from which reasonings with rules and constraints are defined. 

— In this model, a simple graph is interpreted as a fact or a query. 

— A rule allows one to add new knowledge. It is composed of a hypothesis
and a conclusion, and is used in the following classical way: given a simple 
graph, if the hypothesis of the rule projects to the graph, then the informa- 
tion contained in the conclusion is added to the graph. Rules are split into 
production rules and transformation rules. 

— A constraint defines conditions for a simple graph to be valid. It is composed 
of a condition part and a mandatory part. Roughly speaking, a graph satisfies
a constraint if for every projection of its condition part, its mandatory part 
also projects to the graph. 

The deduction problem in this framework can be outlined as follows: given a 
knowledge base (composed of facts, production rules, transformation rules and 
constraints) and a query, is it possible for this knowledge base to evolve in 
such a way that the query is satisfied? More specifically, facts describe an initial 
world. Production rules complete world descriptions. Transformation rules define 
possible transitions from a world to other ones. Constraints are used to check 
validity of worlds. A successor of a valid world is obtained by a single application 
of a transformation rule on this world. Solving the problem is to find a chain 
of valid worlds evolving from the initial one such that the query projects to the 
last one. 

All examples of this section are taken from the modeling of the Sisyphus-I
case study (a problem of allocating offices to the members of a company) with
the constrained derivation framework [BGM99].

3.1 Rules 

A rule “If H then C” is classically defined as a couple of simple graphs, H and C,
respectively called hypothesis and conclusion of the rule, sharing some generic
concept nodes (called here frontier nodes). Briefly, the logical semantics
is extended to rules in the following way: let $x_1, \ldots, x_p$ be the variables assigned
to the frontier nodes; let $\Phi'(H)$ and $\Phi'(C)$ be the formulas obtained from $\Phi(H)$ and
$\Phi(C)$ by removing the existential quantification of $x_1, \ldots, x_p$. Then, given a rule R,
$\Phi(R) = \forall x_1 \ldots \forall x_p\, (\Phi'(H) \rightarrow \Phi'(C))$.

[Bag99] provides a graphical visualization of a rule as a simple graph provided 
with a coloring of its nodes with two colors, say 0 and 1. In drawings, 0-colored 
nodes are painted in white, and the others in black. The subgraph induced by 
the color 0 nodes must be a syntactically correct simple graph. Nodes with color 
0 are the hypothesis nodes and make up the hypothesis of the rule. Concept 
nodes of color 0 with at least one neighbor outside the hypothesis part are the 
frontier nodes. Nodes with color 1 are the conclusion nodes and, together with 
the frontier nodes, they make up the conclusion of the rule. See in Figure 7 rules
R1 and R2 expressing that “near” is a reflexive and symmetric relation.



Panels: a simple graph G; a rule R1; a graph immediately derived; the closure
of G w.r.t. {R1, R2}.

Fig. 7. Rules, derivation and closure



A rule R is applicable to a simple graph G if there is a projection, say $\pi$,
from the hypothesis of R to G. In this case, the result of the application of R
to G following $\pi$ is the simple graph G' obtained from G and the conclusion of
R by restricting the label of each frontier node c in the conclusion to the label
of its image $\pi(c)$ in G, then joining c to $\pi(c)$. In this case, we say that G' is
immediately derived from (G, R) (see Figure 7). Let $\mathcal{R}$ be a set of rules. G' is
said to be immediately derived from $(G, \mathcal{R})$ if there exists a rule R in $\mathcal{R}$ such
that G' is immediately derived from (G, R). A graph G' is said to be derived
from $(G, \mathcal{R})$ if there exists a sequence $G = G_0, G_1, \ldots, G_k = G'$ such that each
$G_{i+1}$ is immediately derived from $(G_i, \mathcal{R})$.
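
The following Python sketch illustrates one immediate derivation. It makes
several simplifying assumptions that are not in the paper: a graph is encoded by
the atoms of its logical formula (concept types become unary atoms, terms
starting with "?" stand for generic nodes of the rule), types are pairwise
incomparable, and the names `bind`, `projections` and `apply_rule` are
hypothetical. The symmetry rule only guesses at the content of Figure 7.

```python
from itertools import count

_fresh = count()  # supplies names for concept nodes created by a conclusion

def bind(sub, term, value):
    """Bind a '?'-variable to value, or check a constant / existing binding."""
    if term.startswith("?"):
        return sub.setdefault(term, value) == value
    return term == value

def projections(source, target):
    """Yield each substitution sending every atom of source into target."""
    atoms = sorted(source)

    def extend(i, sub):
        if i == len(atoms):
            yield sub
            return
        rel, args = atoms[i]
        for t_rel, t_args in target:
            new = dict(sub)
            if t_rel == rel and len(t_args) == len(args) and all(
                    bind(new, a, b) for a, b in zip(args, t_args)):
                yield from extend(i + 1, new)

    yield from extend(0, {})

def apply_rule(graph, conclusion, pi):
    """Graph immediately derived from graph: join the conclusion along pi."""
    sub = dict(pi)

    def image(term):
        if term.startswith("?") and term not in sub:
            sub[term] = f"?n{next(_fresh)}"  # a new generic concept node
        return sub.get(term, term)

    return frozenset(graph) | {(rel, tuple(image(t) for t in args))
                               for rel, args in conclusion}

G = frozenset({("Office", ("o1",)), ("Office", ("o2",)), ("near", ("o1", "o2"))})
hypothesis, conclusion = {("near", ("?x", "?y"))}, {("near", ("?y", "?x"))}
pi = next(projections(hypothesis, G))
G1 = apply_rule(G, conclusion, pi)
```

Here `G1` is immediately derived from G by the symmetry rule: it contains the
atoms of G plus near(o2, o1).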

Theorem 7. (Soundness and completeness of forward chaining) Let G and H be
simple graphs and $\mathcal{R}$ be a set of rules, all relative to a support S.

— if H is derived from $(G, \mathcal{R})$ then $\Phi(S), \Phi(\mathcal{R}), \Phi(G) \models \Phi(H)$;

— provided that G and all rules of $\mathcal{R}$ are in normal form, and the graph obtained
at each derivation step is put into normal form: if
$\Phi(S), \Phi(\mathcal{R}), \Phi(G) \models \Phi(H)$
then there exists a simple graph H' that can be derived from $(G, \mathcal{R})$ such that
H projects onto H'.

Proof. See [SM96] . Note that [SM96] also defines backward chaining operations, 
and proves associated soundness and completeness results. 

3.2 Constraints

We consider positive and negative constraints. A positive constraint expresses 
that “if information A is present, then information B must also be present”. A 
negative constraint expresses that “if information A is present, then information 
B must be absent” . Formally, a constraint is defined in the same way as a rule, 
as a simple graph provided with two colors, 0 and 1. Nodes colored in 0 form the 
condition part of the constraint. A given graph G satisfies a positive constraint
C if every projection from the condition part of C into G can be extended to
a projection of the whole graph C into G. And G satisfies a negative constraint
C if no projection from the condition part of C into G can be extended to a
projection of the whole graph C into G. A graph G is said to be valid w.r.t. a
set of constraints $\mathcal{C}$ if it satisfies all constraints of $\mathcal{C}$.
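
These two definitions can be checked mechanically. The sketch below is an
illustration under assumptions: graphs are encoded by sets of atoms (unary atoms
for concept types, "?" terms for generic nodes of the constraint), types are
pairwise incomparable, and `satisfies` and the example constraints are
hypothetical renderings of the prose descriptions of C1 and C2.

```python
def bind(sub, term, value):
    """Bind a '?'-variable to value, or check a constant / existing binding."""
    if term.startswith("?"):
        return sub.setdefault(term, value) == value
    return term == value

def projections(source, target, partial=None):
    """Yield each substitution extending partial that maps source into target."""
    atoms = sorted(source)

    def extend(i, sub):
        if i == len(atoms):
            yield sub
            return
        rel, args = atoms[i]
        for t_rel, t_args in target:
            new = dict(sub)
            if t_rel == rel and len(t_args) == len(args) and all(
                    bind(new, a, b) for a, b in zip(args, t_args)):
                yield from extend(i + 1, new)

    yield from extend(0, dict(partial or {}))

def satisfies(graph, condition, mandatory, positive=True):
    """Positive: every projection of the condition extends to the whole
    constraint. Negative: no projection of the condition does."""
    whole = set(condition) | set(mandatory)
    extendable = [next(projections(whole, graph, pi), None) is not None
                  for pi in projections(condition, graph)]
    return all(extendable) if positive else not any(extendable)

# Positive: any occupied office must be near the office of a boss.
boss_near = ({("Boss", ("?b",)), ("in", ("?b", "?o")),
              ("Person", ("?p",)), ("in", ("?p", "?o2"))},
             {("near", ("?o2", "?o"))})
# Negative, empty condition: a smoker and a non-smoker never share an office.
no_mixed = (set(), {("Smoker", ("?p",)), ("NonSmoker", ("?q",)),
                    ("in", ("?p", "?o")), ("in", ("?q", "?o"))})

world = {("Boss", ("ann",)), ("Smoker", ("ann",)), ("in", ("ann", "o1")),
         ("Person", ("bob",)), ("NonSmoker", ("bob",)), ("in", ("bob", "o2")),
         ("near", ("o2", "o1"))}
crowded = world | {("in", ("bob", "o1"))}
```

With these definitions `world` satisfies both constraints, whereas `crowded`,
in which Bob also sits in Ann's office, violates the negative one.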



Panels: a positive constraint C1; a negative constraint C2.

Fig. 8. Positive and negative constraints



Fig. 8 presents two constraints: a positive constraint C1, expressing that “a
boss office should be near all offices” (more literally: “if a boss is in an office
and a person is in an office, then this latter office must be near the boss office”),
and a negative constraint C2, expressing that “a smoker and a non-smoker
cannot share an office” (the condition part is here empty). Note that, more
generally, we could restrict negative constraints to negative constraints with an
empty condition. Indeed, given a constraint C and a graph G, the statement “no
projection from the condition part of C to G can be extended to a projection of
C into G” is equivalent to “there is no projection from C to G”. Also note that a
more general form of constraint can be handled, corresponding to the following
intuitive semantics: “if information A is present and information B is absent then
information C must be present and information D must be absent”. It is defined
as a simple graph provided with four colors, one for each part A, B, C and D.

3.3 Constrained Derivation 

Consider that asserted facts (represented by a simple graph F) and production
rules (represented by a set of rules $\mathcal{R}$) describe a world, say W. The associated
deduction problem (given a query Q and a world W, is Q deducible from W,
i.e. does there exist a derivation from $(F, \mathcal{R})$ to a graph F' such that Q can be
projected into F'?) is a semi-decidable problem [CS98]. Let us restrict ourselves
to decidable cases. Such a case occurs for instance if we restrict production rules
to rules that do not add new concept nodes (the conclusion of a rule is composed
of frontier nodes and relation nodes; or in logical terms, the formula assigned to
the conclusion does not add new variables). It is then possible to compute the
closure of F w.r.t. $\mathcal{R}$. Intuitively, a graph G is closed w.r.t. $\mathcal{R}$ if no application
of a rule of $\mathcal{R}$ to G adds information. More formally, for any rule R of $\mathcal{R}$, if R
can be applied onto G (there is a projection $\pi$ from the hypothesis of R into G)
then the resulting graph is equivalent to G ($\pi$ can be extended to a projection
of the whole graph R into G). Given a graph F and a set of rules $\mathcal{R}$, a closure
of F w.r.t. $\mathcal{R}$ is a graph derived from $(F, \mathcal{R})$ closed w.r.t. $\mathcal{R}$. All closures are
equivalent for projection; we call the closure of F by $\mathcal{R}$ the (unique) smallest
graph of this set (see Figure 7). Now, a world $W = (F, \mathcal{R})$ is said to be valid
w.r.t. a set of constraints $\mathcal{C}$ if the closure of F by $\mathcal{R}$ is valid w.r.t. $\mathcal{C}$.
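
In the decidable case just described, a closure can be computed as a fixpoint.
The sketch below is one possible illustration, assuming graphs encoded as sets of
atoms, pairwise incomparable types, and rules whose conclusions use frontier
variables only; `closure` and the two "near" rules are hypothetical stand-ins for
the rules of Figure 7.

```python
def bind(sub, term, value):
    """Bind a '?'-variable to value, or check a constant / existing binding."""
    if term.startswith("?"):
        return sub.setdefault(term, value) == value
    return term == value

def projections(source, target):
    """Yield each substitution sending every atom of source into target."""
    atoms = sorted(source)

    def extend(i, sub):
        if i == len(atoms):
            yield sub
            return
        rel, args = atoms[i]
        for t_rel, t_args in target:
            new = dict(sub)
            if t_rel == rel and len(t_args) == len(args) and all(
                    bind(new, a, b) for a, b in zip(args, t_args)):
                yield from extend(i + 1, new)

    yield from extend(0, {})

def closure(facts, rules):
    """Apply rules until no application adds information (rules must not
    introduce new concept nodes, otherwise pi[t] below raises KeyError)."""
    graph = frozenset(facts)
    changed = True
    while changed:
        changed = False
        for hypothesis, conclusion in rules:
            for pi in list(projections(hypothesis, graph)):
                derived = {(rel, tuple(pi[t] if t.startswith("?") else t
                                       for t in args))
                           for rel, args in conclusion}
                if not derived <= graph:
                    graph, changed = graph | derived, True
    return graph

reflexivity = ({("Office", ("?x",))}, {("near", ("?x", "?x"))})
symmetry = ({("near", ("?x", "?y"))}, {("near", ("?y", "?x"))})
F = {("Office", ("o1",)), ("Office", ("o2",)), ("near", ("o1", "o2"))}
closed = closure(F, [reflexivity, symmetry])
```

Termination follows from the restriction: no new terms are created, so only
finitely many atoms can ever be added.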




Fig. 9. Transformation rules 



Let us add a set of rules, called transformation rules, whose role is to generate 
new worlds. Transformation rules are rules as defined above but the intuitive
semantics becomes “if information A is present in a world, then generate a new
world by adding information B to this world” . See Figure 9: the rule on the left 
says “if there are a person and an office, try to put that person into that office” . 

A more general form of transformation rule can be handled, with hypothesis 
split into a positive part ( “if information A is present” ) and a negative part ( “and 
if information B is absent”). Such a rule is encoded by a graph with three colors. 
It is applicable to a graph if there is a projection from the positive part of the 
hypothesis onto the graph that cannot be extended to a projection of the whole 
hypothesis (“A is found but we cannot find B”). See Figure 9: the rule on the 
right says “if there are a person who is not already assigned to an office, and an 
office, try to put that person into that office” . Note that it would not be possible 
to generalize in the same way production rules because the derivation mechanism 
would become non monotonic (for instance the “unique closure” property would 
not hold true anymore). 

Let $W = (F, \mathcal{R})$ be a valid world w.r.t. a set of constraints $\mathcal{C}$ and let $\mathcal{T}$ be a
set of transformation rules; a successor of W by $\mathcal{T}$ is a valid world obtained by
applying one rule of $\mathcal{T}$ onto the closure of F by $\mathcal{R}$. A world W' is a descendant
of a (valid) world W if there exists a sequence $W_0 = W, \ldots, W_k = W'$ such
that each $W_{i+1}$ is a successor of $W_i$, $0 \leq i < k$. Such a sequence is called a
constrained derivation from W to W'. The global deduction problem can be
expressed as follows. The input is a world $W = (F, \mathcal{R})$, a set of constraints $\mathcal{C}$,
a set of transformation rules $\mathcal{T}$ and a query Q. The question is whether there
exists a constrained derivation from W to a world W' such that Q projects to
W'. Note that every world of a constrained derivation has to be valid, including
W'.
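
The search for a constrained derivation can be sketched as a breadth-first
exploration of valid worlds. The code below is only an illustration: the
function `constrained_derivation` and its parameters are hypothetical, worlds
are reduced to sets of ground facts (so closure, validity and query answering
are plain Python functions instead of projections), and the toy data loosely
follows the office allocation example. In general the search need not
terminate; it does here because the toy search space is finite.

```python
from collections import deque

def constrained_derivation(facts, close, is_valid, successors, answers):
    """Return a chain of valid worlds from facts to a world answering the
    query, or None if the (finite) search space contains no such chain."""
    start = close(facts)
    if not is_valid(start):
        return None
    queue, seen = deque([[start]]), {start}
    while queue:
        chain = queue.popleft()
        if answers(chain[-1]):
            return chain
        for candidate in successors(chain[-1]):  # one transformation rule
            world = close(candidate)             # complete with production rules
            if world not in seen and is_valid(world):  # keep valid worlds only
                seen.add(world)
                queue.append(chain + [world])
    return None

PEOPLE = {"ann": "smoker", "bob": "non-smoker"}  # hypothetical data
OFFICES = ["o1", "o2"]

def is_valid(world):
    """Negative constraint: no office mixes smokers and non-smokers."""
    habits = {}
    for _, person, office in world:
        habits.setdefault(office, set()).add(PEOPLE[person])
    return all(len(h) == 1 for h in habits.values())

def successors(world):
    """Transformation rule: put a not yet assigned person into an office."""
    placed = {person for _, person, _ in world}
    for person in sorted(set(PEOPLE) - placed):
        for office in OFFICES:
            yield world | {("in", person, office)}

def everyone_placed(world):
    return {person for _, person, _ in world} == set(PEOPLE)

chain = constrained_derivation(frozenset(), lambda w: w, is_valid,
                               successors, everyone_placed)
```

The returned `chain` lists the initial world and each successor in turn, which
is the kind of step-by-step trace an expert could inspect.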

Finally, let us point out that the constrained derivation model could handle 
nested graphs [CMS98] instead of simple graphs. 

4 Strengths and Weaknesses of This Graph-Based Approach

Let us first answer some questions about our approach: why are we representing
knowledge with labeled graphs, and especially with conceptual graphs? Why are
we doing reasonings with graph operations instead of logical operations? More
specifically, why are we basing reasonings on graph homomorphism and not on
diagrammatic rules à la Peirce?



Why labeled graphs? Different kinds of labeled graphs have long been used 
to represent knowledge. One of the more attractive features of labeled graphs 
in general for knowledge acquisition and representation is their visual aspect. 
Basically a simple conceptual graph is a bipartite labeled graph, where one class 
of nodes represents entities and the other represents relationships between these 
entities. It is thus easily editable and interpretable by a person, in particular if it 
has a small number of nodes and edges or if it has a special structure enhanced by 
the drawing. As noticed in [BBV97] for instance, in a knowledge modeling con- 
text, “Experts appreciate graphic notations. The very simple graphic concepts 
of Conceptual Graphs (boxes, ovals and arrows) and the use of labels chosen 
in the expert language make modeled knowledge understandable by the expert 
with no effort, whereas they were afraid by textual formal languages”. 

Why conceptual graphs? Several important features distinguish conceptual 
graphs from other labeled graph representations: 

— a clear separation is made between ontological knowledge and factual knowl-
edge; 

— they are not limited to binary relations; 

— they are provided with a logical semantics. 

Why graph-based reasonings? Conceptual graphs are not only a visual lan- 
guage but also a formal knowledge representation model. They are provided with 
reasoning operations which are sound and complete with respect to deduction in 
first order logics (see above for simple graphs, [CMS98] for nested graphs, and 
[Sow84][Wer95][KS97] for more general graphs equivalent to first order logic). 

One commonly held opinion is that conceptual graphs are merely a graphical
interface for logics. Indeed, a justified question is to ask why not simply
translate conceptual graphs into logics, use logical techniques for doing
reasonings, and come back to graphs for presenting results. A related question is why
base reasonings on graph homomorphism (projection) rather than on elementary
graphical rules (canonical formation rules for simple graphs and diagrammatic
rules à la Peirce for general conceptual graphs). The main argument justifying
graph-based reasonings is related to knowledge modeling. Another argument is 
related to computational aspects. 



Benefits from a modeling viewpoint. Let us summarize the main qualities of
the model we propose from a modeling viewpoint (for a more detailed discussion,
see [BGM99]). Objects are easily understandable by an end-user, as are
reasonings, and this for two reasons: projection is a graph-matching operation
that is easily interpretable, and the same language is used at the interface and
operational levels. This property can be of great help in verifying the validity of
an expertise model, for instance. Indeed, it is in general very difficult to ensure
that the expert reasoning is correctly modeled [GS95]. Our model can help by
giving the expert the possibility to run the system and to follow reasonings step
by step in order to check that what is done is correct from his/her viewpoint.



Benefits from a computational viewpoint. Having knowledge represented
by graphs and reasonings based on graph-theoretic notions allows one to build
on combinatorial algorithms and to benefit from results in this domain. Also,
using graphs whereas the dominating formalism is logic^ provides a different
viewpoint. For instance, just to give some immediate examples, a path, a cycle or
a distance are natural notions on graphs but not in logic. See for instance the
algorithm for backward chaining of production rules based on the graph notion
of a piece in [SM96], or the complexity result in [BMT99] (the deduction problem
of a description logic, whose complexity was unknown, is shown to be solvable
in polynomial time by a translation into a graph homomorphism problem).



Limitations. One question we have been exploring can be summarized as
follows: “how far is it possible to go in knowledge representation and reasonings
with labeled graphs and graph matching operations (with graph homomorphism
in the foreground)?”. Indeed, the main potential limitation of our work is
expressiveness. At present we are not able to give a definitive answer to this
question, but can provide some partial results.

Simple graphs represent positive and conjunctive knowledge, and projection
corresponds to deduction on this knowledge. It seems that, taken alone, this
kind of operation (which compares two graphs by a global matching, instead of
trying to obtain one from the other by a derivation sequence) is fundamentally
unable to deal with negation or disjunction in a complete way, even in limited
forms, as shown below. Suppose for instance one wants to deal with difference links,
expressing that two concept nodes represent distinct entities (the contrary of a
co-reference link). A difference link is logically interpreted by the atom
$\neg(x = y)$, where x and y are the terms assigned to the concept nodes at the
extremities of the difference link. One may represent difference as a relation type,
say dif, with associated production rules giving it the semantics of $\neg$equal
($\forall x \forall y\; dif(x, y) \rightarrow dif(y, x)$ and
$\forall x \forall y \forall z\; dif(x, y) \wedge (y = z) \rightarrow dif(x, z)$).

^ and we do not intend here to question this pre-eminence; logic is naturally the
reference formalism as soon as one aims at modeling reasonings.




$\Phi(G)$ implies $\Phi(H)$, but there is no projection from H to G

Fig. 10. Projection and difference links



In Figure 10 difference links are represented by crossed lines. $\Phi(G)$ implies
$\Phi(H)$ but there is no projection from H to G. Indeed,
$\Phi(G) = \exists x \exists y \exists z\; r(x, z) \wedge r(y, z) \wedge \neg(x = y)$ and
$\Phi(H) = \exists x \exists y\; r(x, y) \wedge \neg(x = y)$, and by adding to
$\Phi(G)$ the subformula $((x = z) \vee \neg(x = z)) \wedge ((y = z) \vee \neg(y = z))$, one keeps an
equivalent formula from which $\Phi(H)$ can be trivially deduced.
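
The deduction can be spelled out as a case analysis on the added tautology. The
following is a reconstruction from the two formulas, writing x, y, z for the
witnesses of the formula of G and u, v for the bound variables of the formula of
H; it is a sketch of the argument, not a quotation from the paper.

```latex
\begin{align*}
\text{Case } \neg(x = z):\quad
  & r(x, z) \wedge \neg(x = z)
    \;\Rightarrow\; \exists u\, \exists v\; r(u, v) \wedge \neg(u = v). \\
\text{Case } x = z:\quad
  & \neg(x = y) \text{ and } x = z \text{ give } \neg(y = z), \text{ hence } \\
  & r(y, z) \wedge \neg(y = z)
    \;\Rightarrow\; \exists u\, \exists v\; r(u, v) \wedge \neg(u = v).
\end{align*}
```

No single projection captures this argument: the only difference link of G joins
x and y, which are not the two neighbors of any common relation node, so the
pattern of H has no image in G.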


One approach for obtaining a model equivalent to FOL (or more), while 
keeping projection as the basic operation is that of [KS97]. This work proposes a 
sound and complete reasoning method for general conceptual graphs, based on 
tableaux, which decomposes conceptual graphs into simple graphs, and combines 
projection onto simple graphs with tableaux rules. 

The constrained derivation model outlined in this paper clearly allows some
forms of negation and disjunction, while keeping objects and operations easily
interpretable. The study of its expressiveness is not yet complete.

Abstract. We describe a novel inference method to help us deal with the
problem of having sparse information about the user. The information we will 
have available for the user will form his/her model. That model will be 
represented in a conceptual graph format. We will also have gathered 
information about categories of users that share characteristics, preferences and 
interests. These will form the "prototypes", as we call them and will be 
represented in graphs as well. Because we will know significantly more about 
the prototypes, they will be the source where we will try to get the information 
we want for the user. This method has been implemented in ERIE [2, 3, 10]. 



1 Introduction 

In the past, software systems used to be relatively simple because technological
capabilities were limited. They tended to be appropriate for performing very specific
tasks only. Consequently, a limited number of people were using them: those who
had the relevant background and needed to use them. Nowadays, however, things have
changed: systems have become comparatively complex, and the users that are meant
to use them form very diverse groups. They have different characteristics, needs,
abilities, preferences and interests. As a result, software systems have to become more
individualised and cater for those differences rather than treat all users in the same
manner. Their aim is to make the user's life as easy as possible.

User modelling [9, 12, 14] can be identified in general terms as the part of the
software system's design that deals with the aforementioned problems. Since
the structure of software systems is becoming more and more agent-based, we
can safely assume that user modelling will be taken care of either by a dedicated
agent or by a component of an agent.

Intelligent Agents can be defined as Personalised Assistants that “look over the
user’s shoulder” and learn the user’s characteristics in order to act in his interest [13,
16]. Indeed, the user’s characteristics, which of course will be related to the software
system’s purpose, form the core element of the individualisation. This information is
called the user’s model. The user’s model, in combination with the relevant IA
knowledge base, will be used by the Intelligent Agent to predict the human responses
and needs, to suggest to the user things to do, and to help the user utilise his
computer more efficiently.

There are two problems related to this procedure. The first is concerned with the
knowledge representation of the user’s model: how we build the model, and how we use
the model with the other IA knowledge bases to make inferences and decisions. The
second one is concerned with the fusion problem of individual inferences.

In this paper we will mainly concentrate on the first problem and discuss a form of
knowledge representation and inference method, which is novel. We will
then discuss the fusion problem and give a solution, which is suitable for our case but 
not unique. Examples will be given throughout the analysis of the method. 



2 Knowledge Representation of the User Model 
and of the Prototypes 

It is important to recognize that we will only have sparse information about a 
particular user and while this will build up over time, we should not expect to be able 
to use learning methods that require large amounts of data to provide the user model. 
Essentially, this means that we need to find a way to make inference when we only 
have sparse data available. 

In human terms we do this all the time. We meet someone and make decisions 
about that person with reference to a set of prototypical persons, which we have built 
over time. For example, if we knew someone that worked in a software house, we 
could assume with a certain degree of confidence that he would know how to 
program. We should collect data for building up clusters of types of users, according 
to their behaviour, their abilities, needs, etc and use these clusters as prototypes. The 
way in which the system acquires relevant information to build them is a separate 
problem. It can produce them by extracting information from a database of 
individuals, or the user can provide it with his understanding of prototypes or refine 
existing ones. For our purpose, we will assume that we have a collection of 
prototypical people. 

As for the user's model, the easiest way would be to ask him directly to pick a 
model (categorise himself) from an existing model database. That would leave 
everything to the user. Another way would be to ask the user for relevant information, 
possibly giving him a set of choices for each characteristic we are interested in. 
Finally, the most advanced method and at the same time the least demanding from the 
user's point of view, would be creating the model by learning the necessary 
information. 

The next question that we need to answer is how the system can acquire relevant
information about the user. By observing the user's real-time interaction with the
system, we can learn quite a lot about him: his preferences, his interests, his habits,
etc. Additionally, all systems that use user-profiling techniques should give their
user a certain degree of freedom as well as allow him to get involved with improving 
the agent's performance. Consequently, the user should be able to give feedback 
concerning the system's actions and should be able to improve his own profile at any 
time. 

The representation of the clusters/prototypes we mentioned earlier can be done in
different ways. One approach would be to represent each person in the cluster as a
point in a vector space and to take the average vector of the cluster as our
prototype for the cluster. Having a collection of prototypes represented in this way,
when presented with a particular user, we would match him/her to the nearest
prototype vector. However, we consider this to be a rather simplistic form of
representation because first of all it does not allow us to consider relationships 
between concepts, but only basic attributes. In addition, there is a lack of flexibility as 
a result of having just a single point representing a prototype. Consequently, we 
decided to use Conceptual Graphs that satisfy both of the above requirements. In this 
way, we will be able to represent a cluster as a set of attributes and their relations. 
Conceptual graphs are described in greater detail in a following section. 

Let us give an example to illustrate the difference between the two above forms of 
representation and identify the advantages of the second one. Suppose we had to
represent a cluster of almost perfect circles. One way to achieve this would be to take 
a prototypical circle with radius r and centre c and create the vector (r, c). We would 
then accept a circle as a member of this cluster, if its average radius and centre lie 
close to r and c. A second way of representing the same cluster would be by using 
fuzzy sets [11]. Specifically, by defining the radius as a fuzzy set f and the centre 
having co-ordinates (gx, gy), where gx and gy are two fuzzy sets. The fuzzy sets f, gx 
and gy will thus define a family of circles with varying membership. We could 
describe this family as a fuzzy circle and this will constitute our cluster of acceptable 
circles. To decide whether a new almost perfect circle belongs to this cluster, we 
would have to check whether it lies in the aforementioned fuzzy circle. Relations 
between attributes can be defined as well. In our example, having the attributes 
minimum diameter and maximum diameter of a circle, we can define the difference 
between the two, which will have a maximum magnitude for accepted circles. We can 
then say that for any almost perfect circle, this difference must be g where g is a fuzzy 
number. Conceptual graphs are suitable for efficiently describing attributes as fuzzy 
sets and relations between them. 
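
The fuzzy circle can be made concrete with a short Python sketch. Everything
numeric here is an assumption for illustration: the triangular shape of the
fuzzy sets, their parameters, and the use of the minimum to combine the three
memberships are common choices in fuzzy set practice, not details given in the
text.

```python
def triangular(low, peak, high):
    """Return the membership function of a triangular fuzzy set."""
    def mu(x):
        if x <= low or x >= high:
            return 0.0
        if x <= peak:
            return (x - low) / (peak - low)
        return (high - x) / (high - peak)
    return mu

# Hypothetical fuzzy sets: radius close to 1, centre close to the origin.
f = triangular(0.8, 1.0, 1.2)     # radius
gx = triangular(-0.5, 0.0, 0.5)   # x co-ordinate of the centre
gy = triangular(-0.5, 0.0, 0.5)   # y co-ordinate of the centre

def circle_membership(radius, cx, cy):
    """Degree to which a circle belongs to the fuzzy circle (min-combination)."""
    return min(f(radius), gx(cx), gy(cy))

perfect = circle_membership(1.0, 0.0, 0.0)
slightly_large = circle_membership(1.1, 0.0, 0.0)
too_large = circle_membership(2.0, 0.0, 0.0)
```

A circle is then accepted into the cluster to the degree returned by
`circle_membership`, instead of being compared with a single prototype vector.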



3 The Philosophy Behind Our Approach 

We will collect information on users based on their interaction with the computer and 
will divide them into clusters. The latter will be generic categories of people with 
similar behaviour and characteristics. Those characteristics will be captured in the 
cluster's definition. Each cluster will be defined by the relevant fuzzy set attributes 
and relationships between them and will be represented by a conceptual graph. This 
will form an individual prototype. A new person will be checked against this
prototype to decide how closely he/she matches it. He/she will also be checked against all
the other prototypes that include information that is of interest. By doing this we 
assume that a user that partially matches a certain prototypical person’s behaviour, 
will probably possess other features of that specific prototype as well. The computer 
will be able, when missing some information about its user, to deduce it from 
prototypical users with similar behaviour. This form of reasoning will be inductive or 
even analogical and will have no truth guarantee. 



4 Conceptual Graphs 

Conceptual graphs [15] are finite, connected, bipartite graphs with concept and 
relation nodes. Concept nodes have a label and a referent field, whereas relation nodes 
have only labels. The concept node’s referent field is the concept’s instantiation and 
can either be a value, a set of values or a fuzzy set [4]. The relation nodes’ role is to 
connect the concept nodes and to represent the relationship between them. Conceptual 
graphs are related to semantic nets and overcome some of the difficulties such as the 
“isa” problem, which were found in early uses of the latter. Another major advantage 
that can be easily identified is that these graphs are closely related to natural language. 
Every English sentence can be represented with a conceptual graph and every graph 
corresponds to an English sentence. Here is an example graph that illustrates the 
aforementioned notions. 




Fig. 1. Example conceptual graph (the fuzzy sets are defined on the FUNCTION_TYPE space) 

The graph above represents the sentence: “Bill chooses advanced 
functions”. We can see that the concepts USER and FUNCTION are instantiated to the single 
values “BILL” and “K” respectively, whereas FUNCTION_TYPE is instantiated to the fuzzy set 
“advanced”, which is defined on the FUNCTION_TYPE space. The relation nodes 
“Chooses” and “Of_type” connect the previously mentioned concepts. 
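
A minimal Python sketch of such a graph is given here for illustration. The class names, the restriction to binary relations and the direction of the arcs are assumptions of the sketch, and the fuzzy set is represented by its name only.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Concept:
    label: str        # concept type, e.g. "USER"
    referent: object  # a value, a set of values, or the name of a fuzzy set


@dataclass(frozen=True)
class Relation:
    label: str        # relation type, e.g. "Chooses"
    source: Concept
    target: Concept


@dataclass
class ConceptualGraph:
    relations: list = field(default_factory=list)

    def concepts(self):
        """Concept nodes touched by at least one relation, in order of appearance."""
        nodes = []
        for r in self.relations:
            for c in (r.source, r.target):
                if c not in nodes:
                    nodes.append(c)
        return nodes


user = Concept("USER", "BILL")
function = Concept("FUNCTION", "K")
function_type = Concept("FUNCTION_TYPE", "advanced")  # fuzzy set, by name only

g = ConceptualGraph([
    Relation("Chooses", user, function),
    Relation("Of_type", function, function_type),
])
```

The graph is bipartite by construction: relations only ever link concepts, never other relations.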



5 Outline of the Inductive Inference Mechanism 

In this section we will discuss the preliminary ideas for inference with respect to a 
certain user based on prototypes represented as conceptual graphs. At this point, we 
will not refer to the source of the information used. We will assume that after 
studying people in a certain context, we derived a set of prototype conceptual graphs. 
People with similar behaviour or characteristics, depending on the context, were 
clustered to the same graph. Furthermore, the user’s graph is developed after 
collecting relevant information about him. This information can be acquired in several 
ways: observation of the real-time interaction between him and the system, some 
direct user-feedback concerning the system’s actions, or by other means which will 
not be dealt with in this report. In this report we will refer to the prototype graphs as 
P1, P2, ... Pn and to the user’s graph as S. Examples of these can be seen in figures 2 
and 3: 




Fig. 2. Prototype graph P1 and P'1 (within dotted line) 

Prototype graphs P1, P2 will be representative of the prototype graph category. In our 
example we will only use two graphs for illustrative purposes. However, in reality 
there will be a significantly greater number. 

As we mentioned above, the computer will not always have all the necessary 
information on a particular user in order to take action on his behalf. Its knowledge 
will be limited, so it should make inferences from the information it has available. If 
the user were one of the prototypes, then the answer would be given by part of the 
respective graph. 
Fig. 3. Prototype Graph P2 and P'2 (within dotted line, top) and User Graph S (bottom) 

The information that the computer will be looking for, or in other words the 
information that will be missing from the user’s graph, could be expressed in a 
query/question format when using natural language. For our example, let us suppose 
we would like to answer the question: “How quickly does the user act when dealing 
with important tasks?” Since every natural language sentence can be represented with 
a conceptual graph, we do this for our question and we end up with the graph Q in 
figure 4. To construct Q, we must use concept and relation node labels that exist in 
the prototypes as well, in order to achieve some matching. The information required 
will be concept nodes with empty referent fields (ACTION_SPEED in this case). 




Fig. 4. Question/ Query Graph Q 



We will match graph Q to each of P1, P2...Pn. This corresponds to a maximal 
join operation in which all nodes in Q should match corresponding nodes in P1, P2...Pn. If 
nodes exist in Q that do not find matching nodes in a certain prototype graph, then 
that graph is not capable of providing us with the information we are looking for and 
so we do not consider it any further. After the maximal join, in the resulting graphs, 
the concepts with the empty referent fields will be instantiated. These graphs will look 
like the sub-graphs defined by the dashed line drawn on P1 and P2 with the 
appropriate instantiations. For each graph, we identify the sub-graphs that correspond 
to the answer and strip the remaining concept and relation nodes. The new graphs obtained 
will be called P'1, P'2...P'n. At this point we need to mention that the stripping 
operation would be more efficient if we kept parts of the graph that might be relevant 
to our question. These parts can possibly be identified by looking at the main concept 
of the question graph and keeping the sub-graphs that directly support it. This will be 
analysed further in following reports. 
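
The rule that a prototype is dropped when some node of Q finds no match can be sketched as follows. This is a deliberate simplification: each graph is reduced to the set of its node labels, so the sketch checks label coverage only and does not perform a maximal join. All labels other than ACTION_SPEED are hypothetical.

```python
def can_answer(query_labels, prototype_labels):
    """True when every node label of the query also occurs in the prototype."""
    return set(query_labels) <= set(prototype_labels)


def candidate_prototypes(query_labels, prototypes):
    """Names of the prototypes that are kept for further matching."""
    return [name for name, labels in prototypes.items()
            if can_answer(query_labels, labels)]


# Hypothetical node labels for the query graph Q and two prototype graphs.
query = {"USER", "Deals_with", "TASK", "ACTION_SPEED"}
prototypes = {
    "P1": {"USER", "Deals_with", "TASK", "ACTION_SPEED", "WORK_LOAD"},
    "P2": {"USER", "Deals_with", "TASK"},  # lacks ACTION_SPEED, so it is dropped
}
kept = candidate_prototypes(query, prototypes)
```

A fuller version would also have to check that the matched nodes are connected in the same way in both graphs.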

The graph S corresponding to the specific user is also treated in this way. Q is 
mapped onto S and S is stripped of non-relevant nodes to give S'. This will look like 
the sub-graph defined by the dashed line in S, with an additional ACTION_SPEED 
concept with empty referent field. S of course will not contain the sub-graph 
corresponding to the answer, otherwise we would not be referring to the prototype 
graphs to obtain this information. 
We now take S' and match it onto P'1, P'2...P'n by performing maximal join 
operations. This results in graphs P"1, P"2...P"n. These maximal joins will not be 
complete joins and some measure of completeness will be used to give a support for 
how well S' matches each of P'1, P'2...P'n. When we talk about matching two graphs, 
we mean matching one's relations and concepts against the other's. Two relations 
match when their type is the same. Two concepts match with a support s=1 when their 
types and referents are identical and with a support s<1 when their types are the same 
but their referents differ. Since we consider the case where referents can be fuzzy 
sets as well [4], in that specific case we perform Point Semantic Unification [1], 
which is based on the Mass Assignment Theory [5], to obtain a support. Because we 
get a support from each pair of concepts that is matched, we take the overall 
support to be the conjunction of the individual supports. Consequently, we end up 
with a support for each pair S' and P'i. Let these supports be given by s1, s2...sn. In 
our example, the information we would get from P"1 is that “Nick deals with 
non-important tasks soon” with a support s1 and from P"2 that “Nick deals with important 
tasks immediately” with a support s2. We can now pick out the parts of P"1, P"2...P"n 
that correspond to our answer. Let these answer graphs be A1, A2...An. We do this 
by projecting P"1, P"2...P"n onto our query graph and identifying the part of the 
former that does not project onto anything. In our example they will just consist of the 
ACTION_SPEED concept instantiated to one of the fuzzy sets of the respective space 
(Fig. 3). 
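
The matching rules for concepts can be sketched in Python as follows. Two points are assumptions of the sketch rather than part of the method: the conjunction of supports is taken to be their product, and the support between two different fuzzy referents is read from a hypothetical lookup table standing in for Point Semantic Unification, which is not implemented here. The concept labels and table values are illustrative only.

```python
def concept_support(user_concept, proto_concept, referent_support):
    """Support for a pair of concepts, each given as (type, referent).

    Identical type and referent give 1.0; same type but different referents
    give whatever `referent_support` returns (expected to be below 1).
    """
    u_type, u_ref = user_concept
    p_type, p_ref = proto_concept
    if u_type != p_type:
        raise ValueError("concepts of different types do not match")
    if u_ref == p_ref:
        return 1.0
    return referent_support(u_ref, p_ref)


def overall_support(matched_pairs, referent_support):
    """Conjunction of the individual supports, taken here as a product (assumption)."""
    total = 1.0
    for user_concept, proto_concept in matched_pairs:
        total *= concept_support(user_concept, proto_concept, referent_support)
    return total


# Hypothetical supports between fuzzy-set names.
TABLE = {("high", "medium"): 0.6}


def lookup(a, b):
    return TABLE.get((a, b), TABLE.get((b, a), 0.0))


pairs = [
    (("TASK", "important"), ("TASK", "important")),   # identical: support 1
    (("WORK_LOAD", "high"), ("WORK_LOAD", "medium")),  # same type: support < 1
]
s = overall_support(pairs, lookup)
```

Replacing the product by min would give another common reading of conjunction; the text does not say which is intended.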

At this stage we have a set of answer graphs A1, A2...An with a support for each 
one. It is now necessary to fuse these to obtain a final answer graph A. This will be 
done with a combining scheme, based on the Mass Assignment Theory [5], that will 
take into account the supports as well. Essentially, our answers will have the form of 
fuzzy sets with accompanying supports. If we wanted to combine for example f1 and 
f2 with supports s1 and s2 respectively: 

$$\mathrm{ExpLeastPrejudicedDistribution}(f_1) = s_1 \cdot \mathrm{LPD}(f_1) + (1 - s_1) \cdot \mathrm{LPD}(f_1) = f_1'$$
$$\mathrm{ExpLeastPrejudicedDistribution}(f_2) = s_2 \cdot \mathrm{LPD}(f_2) + (1 - s_2) \cdot \mathrm{LPD}(f_2) = f_2'$$

$$f_{final} = f_1' \cap f_2'$$

The answer graph A will contain the information required from the computer to act 
on the user’s behalf. The feedback given by the user can be potentially used to adjust 
and improve the combining schemes used to match the user’s preferred choice. 

At present, a Conceptual Graph Toolkit software package has been 
implemented in FRIL [2, 3, 10]. It allows graphs to be stored in linear format and 
lets us perform several operations such as maximal join, simplification, restriction, etc. 
Furthermore, the whole inference mechanism described above has been developed. 



6 Application - "The Forum" 

Researchers at BT's Adastral Park have developed the "Forum" [11], which is an 
on-line collaborative virtual working environment [6] aiming to bring people together 
both informally and formally. It is designed to allow people who should meet each 
other to do so easily and naturally and provides the means for them to have richer 
on-line meetings. It is divided into two parts: the Contact Space and the Meeting Space. 
We will focus on the first. The Contact Space consists of a number of zones that 
represent subject interests. According to the user's "overall" interest at a particular 
moment, his avatar is placed in the appropriate zone along with other users' avatars. 
An avatar can chat to any other in the same zone. 

Our aim in this project is not only to categorize a certain user based on his 
interaction with the computer and subsequently place his avatar in the relevant zone. 
That can be achieved in several different ways. We would propose one related to the 
mechanism described previously; however, we will not go into it in detail in this 
paper. Our aim is to make the whole system act more intelligently. A system is 
not necessarily intelligent simply because it knows, for example, all the symptoms of 1000 
diseases. It must be able to make rational diagnoses based on the presence and 
absence of different combinations of them. In the Contact Space case, one possibility 
would be the need to make inferences about a user without knowing everything about him, 
e.g. how interested he is in interest X. Another would be to suggest to 
someone that they go and "chat" with person Y because they have a common interest, apart 
from the "zone's" interest. We will now analyze some ideas on the way our novel 
inference method can be applied in the "Forum". 



7 Applying Intelligent User Modelling to "the Forum" 

Our target is to create prototype user models, which will constitute the different user 
categories, in other words the user groups. After “constructing” those models and a 
similar model of the user, we will be able to identify similarities between them, which will 
allow us to draw conclusions. “What should a prototype consist of?” is a question that 
can be answered in many different ways depending on the information we have 
available and the scope of our system. It is crucial to find the right combination 
of pieces of information to form the prototype user models. Two requirements need to be 
satisfied in every case: to take advantage of the information gathered from different 
sources, so that important information is not left out, and at the same time not to 
include too much detail that would make the system too complex and less flexible. 

Initially, we need to identify the different sources of information for a particular 
user. These can be the topics the user works on, the applications he/she uses, the 
topics he/she searches for on the Internet and his/her interest areas based on Jasper [7, 
8]. This kind of information can be easily collected and summarized. It can be 
generalized in a way that nothing important is omitted and everything that is not of 
great interest is left out. In any case, the information used should be enough to 
distinguish one prototype from another. 

Each prototype can represent a different zone in Contact Space. That means that 
we will have a “Software Engineering” prototype, a “Wearable Computing” 
prototype, a “Distributed Computing” prototype, a “Shared Virtual Worlds” prototype, etc. 
Each one of them will include the characteristics of a person that would definitely belong 
to that category. For example, a “Shared Virtual Worlds” prototype user will be 
someone whose interests are “Virtual Reality”, “Animation”, “Computer Games”, 
who works on “VRML” and “C++”, who searches for “avatars” and “navigation in 
virtual space” on the Internet and who uses applications such as “The Forum” and 
“Visual C++”. 
Fig. 5. Example Conceptual Graph (top) and Abstract User's Graph (bottom) 

As we specified in the inference method description, we are going to use 
conceptual graphs as a method of knowledge representation. These graphs will have 
concepts instantiated to fuzzy sets in order to be able to obtain a support when 
performing the matching later on. In our case, the concept nodes will be key phrases 
such as the ones mentioned above, with instantiations to fuzzy sets that will show the level 
of interest in that particular topic. The relation nodes will represent the source of 
information. In figure 5 (top) we can see an example conceptual graph and the fuzzy 
sets defined in LEVEL_OF_INTEREST space. 




Fig. 6. Query Graph (top) and Prototype Graph 1 (bottom). Concept and relation 
nodes in dotted lines are stripped 

A collection of graphs with an architecture similar to that of the prototype graph in figure 6 
would play the role of the different user groups. The user himself will have a graph 
that will encapsulate his characteristics. At that point, we can say that we 
have all the necessary information in a format appropriate for manipulation. We are 
now able to perform several operations. 

One is to compare the user’s graph to all the prototypes and find a degree of 
matching to each one. That will give us the ability to say to which category, and thus 
to which zone, the user belongs, having his specific attributes in mind. The other, 
probably more useful, operation we can perform, as analyzed in the proposed 
method’s description, is to infer things about the user that we don’t know. 
Suppose we wanted to find out how interested user K is in interest topic 5 (INT5 
in graph). We would hypothetically have graphs that would resemble the ones shown 
in figures 5 (bottom), 6 and 7. 




Fig. 7. Prototype Graph 2. Concept and Relation nodes in dotted lines are stripped 

By following the procedure described previously, we “strip” both the user graph 
and the prototype graphs of any irrelevant sub-graphs, using the query graph as 
our guide. The irrelevant parts can be seen in dotted lines. We then match the 
resulting stripped user graph onto each stripped prototype graph. Some 
concepts match and some do not. The ones that match will match with a 
certain degree of support, which will come from combining their fuzzy sets (point 
semantic unification [1]). An overall support per prototype-user pair can be obtained 
by means of conjunction. 

The answer, which in our case would be the instantiation of the INT5 concept, can 
be found by projecting the query graph on the prototypes. Consequently, we end up 
with a fuzzy set as an answer and a support associated with it, for every prototype we 
considered. We can combine these using a combination scheme based on Mass 
Assignment theory that combines probability distributions taking supports into 
account. In our example, we would get “medium” as an answer from prototype 1 with 
a support S1, say, and “high” from prototype 2 with a support S2. The combination of 
these two would result in a fuzzy set defined on the LEVEL_OF_INTEREST space. 
Depending on what format we want our answer to have, we can get an exact value in 
that space, or we can get a linguistic output such as “fairly high” or “rather low”. 



8 Summary 

We have described a novel method that gives us the ability to infer 
information we do not know about the user, based on the information we already 
have. The method uses Conceptual Graphs [15] as a form of knowledge 
representation and employs Fuzzy Set Theory [17] as well as the Mass Assignment 
Theory [5]. "The Forum" has been introduced as a system to which we can apply our 
mechanism. 

Abstract. This work is part of a national project which aims at building 
a tool for the analysis of microbial risks in food products. As a 
first step of this work, we propose a unified querying system which 
simultaneously scans two different bases: a relational database containing 
structured information and a conceptual graph knowledge base containing 
semi-structured information. These two bases contain microbiological 
information. To achieve this, we propose a way of translating a database 
query expressed in a specific language into a query represented by a 
conceptual graph. This graph is projected into the base. It can also be 
generalized in order to avoid silent answers. 



1 Introduction 

Our research work is part of a national project which brings together government 
institutions and industry in order to build a tool for the analysis of microbial 
risks in food products. The first step of this project consists in gathering in 
a “database” all the information available in the scientific bibliography in mi- 
crobiology that can be useful for risk assessment. In this field of application, 
information can be either qualitative or quantitative. The information is often 
imprecise because of the complexity of the biological processes involved. 

Another problem is that this information consists of experimental results 
in a field where knowledge is growing every day. The integration of this information 
is a source of irregularity: similar data are often represented in different 
ways in independent bibliographical references. The term “semi-structured” 
is used to qualify this kind of information, which is not really structured but 
presents similarities, even if they are implicit. With semi-structured information, it is 
very difficult to determine a classical database schema in order to store all the 
useful information. Different approaches have been proposed to solve this kind 
of problem: (i) the definition of a new kind of database management system especially 
designed for semi-structured data [1]; (ii) the definition of viewpoints in 
the object model [2]; (iii) hybrid approaches combining the use of languages designed 
for semi-structured information representation, such as XML, with 
object-oriented DBMS [3] or semi-structured data DBMS [4]. 

In this paper, we propose another approach which consists in designing a 
unified querying system (called UQS) that scans two separate bases simultaneously: 
(i) a relational database extended to the representation of fuzzy data 
containing the structured information, processed by the CFQ engine (for Contextual 
Fuzzy Querying), (ii) a conceptual graph knowledge base containing the 
semi-structured information, processed by the SSI engine (for Semi-Structured 
Information). 

We chose the CG model to represent the semi-structured information for several 
reasons. First, its graph structure 
is well suited for the representation of weakly structured information. Second, 
the projection operation is available to perform database scanning. Third, the 
terminological knowledge can be useful to implement enlarged flexible querying. 
Fourth, different software platforms are available, allowing one to realize 
prototypes easily. 

The relationships between Relational Databases and CGs have been studied 
in several works. They generally aim at representing Relational Databases in 
terms of CGs in order to express all the knowledge of an RDBMS (schema, data, 
query, view, ...) in a unified framework [5]. Such a model allows one to 
take advantage of the terminological knowledge of the CG model, in order to 
introduce flexibility in the process of querying as in [6]. Our approach is rather 
different: we don’t aim at translating a formalism into another. On the contrary, 
we propose to use uniformly two bases of a different nature; the base using the 
CG model allows us to relax the mandatory constraint of schema pre-existence 
in the relational model. 

In this paper, we study the architecture of the SSI engine. On the one hand, 
we will focus on the representation of semi-structured information and on the 
other hand, on the way of querying this information. We will assume that values 
used as selection criteria are crisp values (not fuzzy) and that the information 
stored in the CG knowledge base is precise; the “fuzzification” of this part of 
the unified querying system will be presented in a future paper. 

In the second section, we define the UQS query language and give explanations 
about the CFQ engine. In the third section, we explain how our application 
is modelled in terms of a CG knowledge base. In the fourth and fifth sections, 
we define respectively the notion of attribute/value pair in terms of CGs and 
the notion of view used by the unified querying system in order to be able to scan 
the CG knowledge base. In the sixth section, we propose an algorithm allowing 
one to search for answers to a query on the CG knowledge base. 

2 The UQS User Interface and the CFQ Engine 

2.1 UQS Querying Language 

In UQS, the queries are expressed in terms of a set of DB-projection attributes 
and a set of selection criteria using the form attribute/value. (In order to prevent 
ambiguities, we use the term DB-projection when dealing with the notion used in the 
relational database model.) These queries are expressed in a given view. A view is a 
classical concept in databases, i.e. a virtual 
table in which all the information needed by the user is brought together. 

Definition 1 A query Q in UQS is a set 
$\{V, a_1, \ldots, a_n, \langle a_{n+1}, v_{n+1} \rangle, \ldots, \langle a_m, v_m \rangle\}$ 
where $V$ is the name of the view in which the query is asked; 
$a_1, \ldots, a_n$ are the attributes of the DB-projection and 
$\langle a_{n+1}, v_{n+1} \rangle, \ldots, \langle a_m, v_m \rangle$ are the attribute/value 
pairs of the selection criteria. Note that 
$\{a_1, \ldots, a_n\} \cap \{a_{n+1}, \ldots, a_m\}$ is not necessarily empty. 

The result of the execution of a query in UQS is a table which is composed 
of a number of columns equal to the number of DB-projection attributes. Each 
row represents a tuple composed of <attribute, value> pairs resulting from 
the query. 

Definition 2 An answer A to a query Q in UQS is a set of tuples, each of 
them of the form $\{\langle a_1, v_1 \rangle, \ldots, \langle a_n, v_n \rangle\}$ with 
$a_1, \ldots, a_n$ the attributes of the DB-projection and $v_1, \ldots, v_n$ the values 
resulting from the execution of the query associated with each attribute of the DB-projection. 
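
Definitions 1 and 2 can be illustrated with a small Python sketch. The class and function names, the view name and the sample rows are hypothetical, and the selection is evaluated here by exact comparison over a list of dictionaries, which ignores the fuzzy matching performed by the actual engines.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Query:
    view: str          # V, the name of the view
    projection: tuple  # a_1 .. a_n, the DB-projection attributes
    selection: tuple   # (a_{n+1}, v_{n+1}) .. (a_m, v_m)


def answer(query, rows):
    """Keep rows meeting every selection criterion, restricted to the projection.

    `rows` is a list of dicts standing in for the view's virtual table.
    """
    result = []
    for row in rows:
        if all(row.get(a) == v for a, v in query.selection):
            result.append(tuple((a, row.get(a)) for a in query.projection))
    return result


q = Query(
    view="BacteriocinInteraction",          # hypothetical view name
    projection=("Temperature", "Bacteriocin"),
    selection=(("Bacteriocin", "Nisin"),),  # overlaps the projection, which is allowed
)
rows = [  # hypothetical content
    {"Temperature": 37, "Bacteriocin": "Nisin"},
    {"Temperature": 20, "Bacteriocin": "Pediocin"},
]
tuples = answer(q, rows)
```

Each element of the result is one tuple of <attribute, value> pairs, with as many pairs as there are DB-projection attributes.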

2.2 The CFQ Engine 

An exhaustive presentation of the CFQ engine is given in [7]. Let us now present 
it briefly. For the relational database scanning, we have introduced the new 
concept of contextual fuzzy view. This notion is an extension of the classical 
concept of view, which is fuzzy because the selection criteria are expressed as 
fuzzy predicates as in [8]. This view is also contextual because it associates the 
user’s preferences expressed in the query with a category of queried information 
(defined in the knowledge base of the information retrieval system). The contextual 
fuzzy view has three advantages. First, the CFQ engine enlarges queries 
when they are too restrictive because of the actual content of the database: it 
provides an additional degree of flexibility not available in the previous systems. 
Second, the view engine makes an estimation of the data sought, in addition to 
the nearest information retrieved from the database. For this purpose, statisti- 
cal models have to be incorporated into the categories of queried information. 
Third, the view engine optimizes the fuzzy matching processing because the 
user’s preferences are only compared with the information associated with the 
category selected: in the previous systems, all the information stored in the 
queried tables was compared with the user’s preferences. The concept of contextual 
fuzzy view is implemented in a prototype written in Java called 
CFQ (for Contextual Fuzzy Querying). The information is stored in an Oracle 
DBMS using the imprecision representation model of FSQL [8]. 

3 Representation of the Semi-structured Data from Our 
Knowledge Base in Terms of Conceptual Graphs 

The underlying application concerns information retrieved from the scientific 
bibliography in microbiology for risk assessment. More precisely, it is information 
describing the behaviour of pathogen germs (increase, decrease or stability of 
the concentration of a pathogen such as Listeria Monocytogenes) in food 
products during food transformation processes (heating, cutting, storage, mixing, 
etc.). In this paper, we take the example of information about the behaviour 
of Listeria in food when it interacts with another family of bacteria, called Bacteriocin, 
which can be inoculated in food products in order to inactivate Listeria 
(that is, to obtain a decrease in the concentration of Listeria). 

We chose the CG model [9] as a formalism to represent semi-structured data. 
More precisely, we follow the formalization introduced in [10]. The principal 
differences with the model introduced by Sowa are that the partially ordered 
concept type set is not necessarily a lattice, and that CGs can be non-connected 
graphs (the projection of G into H is then defined in terms of the projection of 
each connected component of G into H). 

3.1 The Terminological Knowledge (The Support) 

In modelling the support, we represent the main part of the application 
semantics in the set of concept types. A weaker part of the application semantics 
is represented in the set of relation types. Relation types must act as connectors 
that are as generic as possible in order to be as stable as possible. The concept type set 
thus contains: biological information organized in a taxonomy (substrate, germs), 
different types of data retrieved from scientific papers (bibliographical data, 
experimental data, model parameters, units of measure, ...) and the different 
actions described in the papers (heating, germs interaction, ...) as well as their 
results in terms of the behaviour of the germs (increase, decrease or stability of 
pathogen). Part of the concept type set is given in fig. 1. 



Fig. 1. A part of the concept type set (diagram not reproduced; legible type labels: Universal, Milk, Skimmed milk, Half skimmed milk) 

Remark 1 It is important to note that all the attributes of the DB tables that 
can be queried in UQS have to be inserted in the concept type set. 

The set of relation types contains generic relations such as Agt(Action, Germ) 
(has for agent), Char(Universal, Experimental Datum) (is characterized by), 
Obj(Action, Universal) (has for object), Res(Action, Action) (has for result), 
Unit(Datum, Measure Unit) (has for unit). Some relation types may be sub-typed 
in order to obtain a stronger type control. For example, the relation type 
Unit is sub-typed into Temperature Unit, Time Unit and Concentration Unit. 

The set of individual markers is used as is usual in the CG model to represent 
the instances of concept types. Even if our goal is to represent complex numeric 
values (such as fuzzy intervals for instance) , the only markers we use are symbolic 
ones (individuals). In this first step of our work, the matching of numeric values is 
limited to exact matching. In other words, a temperature of 20 is represented by 
the concept vertex [Temperature : twenty], “twenty” being an individual marker 
(in our examples, for better readability, we use the numerical notation). The 
extension to numerical fuzzy values will be studied in a future work. 

Remark 2 We consider that the intersection of the set of the concept type labels 
and the set of the individual marker labels is empty. If Nisin is a concept type, 
sub-type of the concept type Bacteriocin, we don’t accept an individual marker 
labeled “Nisin”. 

3.2 Assertional Knowledge (The Graphs) 

In our knowledge base, each graph represents an elementary piece of information. 
For instance, all the information relative to a biological experiment (conditions of 
the experiment such as temperature, pH, nature of the substrate, and results of 
the experiment, ...) belongs to the same graph. Of course, a scientific publication 
can be expressed by several graphs. In fig. 2 and 3, we give the two graphs 
which correspond to the two experiments presented in the following biological 
result taken from [11]: “at a temperature of 37 °C, nisin at a concentration 
of 50 U/ml is very efficient (reduction after 2 hours) against Listeria Scott A 
(initial concentration of 6.1(fi CFU/ml) in skimmed milk and very inefficient 
(stability) in half-skimmed milk”. 

We chose to represent the graphs in our knowledge base with connected, 
possibly cyclic CGs. 

Definition 3 The knowledge base $KB = \{G_1, \ldots, G_p\}$ containing the semi-structured 
knowledge of our system is a set of connected, possibly cyclic CGs. 

4 An Analogy of the Notion of Attribute/Value in Terms 
of Conceptual Graphs 

The notion of attribute/value pair is used in two different steps of the search for 
answers in a knowledge base composed of CGs: 

1. during the selection operation, since we have to select information in which 
an attribute has a given value; 

2. during the DB-projection, since we have to associate a value with each 
attribute of the DB-projection to build a tuple of the answer. 

Fig. 2. An example of semi-structured information (experiment 1) 

Fig. 3. A second example of semi-structured information (experiment 2; only the subpart 
that differs from experiment 1 is shown) 

4.1 Simulation of a Selection in Terms of Conceptual Graphs 

Assume we want to select a set of CGs from KB which satisfy a single <a,v> 
pair of selection criteria. 

Definition 4 The concept vertex of selection associated with a pair of selection 
criteria <a,v>, noted sel-vertex(<a,v>), is a single concept vertex labeled: 

- [v : *] if v is a sub-type of a in the concept type set; 

- [a : v] if v is not a sub-type of a in the concept type set. 

For example, if the pair of selection criteria is <Bacteriocin, Nisin>, the 
concept vertex of selection is [Nisin : *] because Nisin is a sub-type of Bacteriocin. 
If the pair of selection criteria is <Temperature, 20>, the concept vertex of 
selection is [Temperature : 20] because 20 is not a sub-type of Temperature. 

Definition 5 A CG G satisfies a selection criterion <a,v> iff the graph limited 
to the single concept vertex sel-vertex(<a,v>) can be projected into G. 
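
Definitions 4 and 5 can be sketched in Python as follows. The fragment of the concept type set is hypothetical and is stored as a tree (one parent per type), which is simpler than the general partial order allowed by the model. A graph is reduced to the list of its concept vertices, which is sufficient for projecting a graph made of a single vertex.

```python
# Hypothetical fragment of the concept type set: child -> parent.
PARENT = {
    "Nisin": "Bacteriocin",
    "Bacteriocin": "Universal",
    "Temperature": "Universal",
}
TYPES = set(PARENT) | set(PARENT.values())
GENERIC = "*"


def is_subtype(t, ancestor):
    """True when t equals `ancestor` or lies below it in the type tree."""
    while t is not None:
        if t == ancestor:
            return True
        t = PARENT.get(t)
    return False


def sel_vertex(a, v):
    """Concept vertex of selection for the pair <a, v> (Definition 4)."""
    if v in TYPES and is_subtype(v, a):
        return (v, GENERIC)
    return (a, v)


def projects_into(vertex, target):
    """Projection of one concept vertex onto one concept vertex of a graph."""
    v_type, v_ref = vertex
    t_type, t_ref = target
    if not is_subtype(t_type, v_type):
        return False
    return v_ref == GENERIC or v_ref == t_ref


def satisfies(graph_concepts, a, v):
    """Definition 5, with the graph given as its list of (type, referent) vertices."""
    vertex = sel_vertex(a, v)
    return any(projects_into(vertex, c) for c in graph_concepts)


g = [("Nisin", GENERIC), ("Temperature", 37)]
```

With this graph, the criterion <Bacteriocin, Nisin> is satisfied through the vertex [Nisin : *], whereas <Temperature, 20> is not, since the only Temperature vertex carries the marker 37.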

4.2 Simulation of a DB-Projection in Terms of Conceptual Graphs 

Assume we want to compute a DB-projection in a CG, with an attribute of 
projection a. 

Definition 6 The concept vertex of DB-projection associated with an attribute
of DB-projection a, noted proj-vertex(a), is a single concept vertex labeled [a : *].

Definition 7 The result of the DB-projection of an attribute a on a graph G of
KB in which proj-vertex(a) can be projected, is the pair <a,v> built as follows.
Let c be the concept vertex image of proj-vertex(a) through the projection:

— v is the referent of c if c is an individual concept vertex and type(c) = a;

— v is the type of c if c is a generic concept with type(c) < a;

— v is the pair <type(c),ref(c)> if c is an individual concept vertex and
type(c) < a.

For example, if the DB-projection attribute is pH and a CG G in the knowl- 
edge base is [pH : 7], we build the concept vertex of DB-projection [pH : *] 
which can be projected into G. The result of the DB-projection is <pH, 7>. 
If the DB-projection attribute is Bacteriocin and a CG G' in the knowledge 
base is [Nisin : *], we build the concept vertex of DB-projection [Bacteriocin : *] 
which can be projected into G'. The result of the DB-projection is <Bacteriocin, 
Nisin>. If the DB-projection attribute is Bacteriocin and a CG G'' in the
knowledge base is [Nisin : we build the concept vertex of DB-projection [Bacteriocin : *]
which can be projected into G''. The result of the DB-projection is
<Bacteriocin, Nisin(^j^212)>.
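
The three cases of Definition 7 can be sketched in Python as follows. The encoding is an assumption made for illustration: the image vertex c is a `(type, referent)` pair, `"*"` marks a generic vertex, the hierarchy is a toy dictionary, and the referent `"n1"` in the usage lines is a hypothetical stand-in. The comparison type(c) < a is read as strict, so a generic vertex whose type is exactly a falls outside the listed cases and yields `None` here.

```python
PARENT = {"Nisin": "Bacteriocin"}  # hypothetical hierarchy (child -> parent)

def is_strict_subtype(t, a):
    """True if t lies strictly below a in PARENT."""
    t = PARENT.get(t)
    while t is not None:
        if t == a:
            return True
        t = PARENT.get(t)
    return False

def db_projection_result(a, image):
    """Definition 7: image is the (type, referent) label of the vertex c."""
    c_type, c_ref = image
    individual = c_ref != "*"
    if individual and c_type == a:
        return (a, c_ref)               # v is the referent of c
    if not individual and is_strict_subtype(c_type, a):
        return (a, c_type)              # v is the type of c
    if individual and is_strict_subtype(c_type, a):
        return (a, (c_type, c_ref))     # v is the pair <type(c), ref(c)>
    return (a, None)                    # case not listed in Definition 7 (assumption)

print(db_projection_result("pH", ("pH", 7)))
print(db_projection_result("Bacteriocin", ("Nisin", "*")))
print(db_projection_result("Bacteriocin", ("Nisin", "n1")))  # "n1" is hypothetical
```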

Remark 3 In the last example, a pair of values is returned for a single attribute. 
Even if such a result is not atomic, in contradiction with the usual DB-projection 
rule, we chose this solution because both values can bring meaningful information 
to the user. It is not a major drawback because we consider the DB-projection 
operation as a final one. 

5 An Analogy of the Notion of View in Terms 
of Conceptual Graphs 

In section 2.1, we saw that the query in UQS is expressed in a view in terms of
a set of projection attributes and a set of pairs of selection criteria <attribute,
value>. The notion of view is one of the central notions of
our system. In relational database terms, a view is a virtual table built by
means of a query. This query contains attributes used for a “selection” purpose
(the attributes appearing in the “where” clause) and attributes used for a
“DB-projection” purpose (the attributes appearing in the “select” clause). All these
attributes can be taken from different tables of the database. It is not easy to
express this notion of view in terms of CGs. What we need is a process allowing
us to build a “virtual table” by querying the graphs of the base.

Naive approach. In order to extract information from the CGs, we need to
find CGs which satisfy all the selection criteria, then get the values associated
with all the DB-projection attributes.

Let $Q = \{V, a_1, \dots, a_n, \langle a_{n+1}, v_{n+1}\rangle, \dots, \langle a_m, v_m\rangle\}$ be the query we want
to answer on KB.

A naive method of finding such CGs would consist in filtering the knowledge 
base in three steps. 

1. Building a disconnected CG query composed of:

— n single concept vertices corresponding to proj-vertex($a_i$) for each
projection attribute $a_i$ of Q;

— $m - n$ single concept vertices corresponding to sel-vertex($\langle a_j, v_j\rangle$) for
each selection criterion $\langle a_j, v_j\rangle$ of Q.

2. Projecting the query into each CG of KB. We obtain a set of graphs which
satisfy the selection criteria, and which contain each attribute of the
DB-projection.

3. Building, for each of these graphs, a tuple $\{\langle a_1, v_1\rangle, \dots, \langle a_n, v_n\rangle\}$ of the
answer by extracting the result of the DB-projection of each projection
attribute as presented in definition 7.
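
The three steps above can be sketched as follows, under simplifying assumptions that are not in the paper: each CG is reduced to its list of `(type, referent)` concept vertices, types are compared by equality (no type hierarchy), and the sample graph is invented. Because the query vertices are disconnected, every combination of matching vertices yields a tuple.

```python
from itertools import product

def naive_answers(kb, proj_attrs, sel_criteria):
    """kb: list of graphs, each a list of (type, referent) concept vertices."""
    answers = []
    for graph in kb:
        # Step 2: every selection vertex [a : v] must be found in the graph.
        if not all((a, v) in graph for a, v in sel_criteria):
            continue
        # Step 3: each way of mapping every [a : *] onto a vertex gives a tuple.
        candidates = [[ref for (t, ref) in graph if t == a] for a in proj_attrs]
        for values in product(*candidates):
            answers.append(tuple(zip(proj_attrs, values)))
    return answers

# Hypothetical graph: an experiment at 37 degrees and pH 7, plus an
# unrelated external temperature of 20 degrees.
g = [("Experience", "e1"), ("Temperature", 37), ("pH", 7), ("Temperature", 20)]
for row in naive_answers([g], ["Temperature", "pH"], []):
    print(row)
```

The second tuple printed pairs the external temperature with the pH of the experiment, since nothing in the disconnected query ties the two vertices to the same experiment.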

A drawback of this simple mechanism is its serious lack of semantics.
Whereas in the relational database view, the virtual table contains meaningful
tuples (because the manager of the database implements pertinent queries), with
the naive method, we might get absurd results. For example, if we search for the
temperature and the pH of an experience, and if a CG contains information about
such an experience but also contains a reference to the external temperature,
the projection of two single concept vertices may return the external temperature
and the pH of the experience as a result, which is nonsense. In order to avoid
such “noise” in the answer, we propose to use “schemata”, which are specific
graphs containing all the concepts that belong to the considered view. These
schemata link the attributes with semantics. For example, if we know that the
temperature and the pH are relative to the condition of the same biological
experience, we can have a schema which is a CG containing two generic concept
vertices [Temperature : *] and [pH : *], linked to another concept [Experience : *]
with appropriate relation vertices. Then, if we find a projection of such a schema
graph, we know that the values of the attributes temperature and pH belong
to the same experience. Note that we assume that all the schema graphs for
each view of the system are given by the manager of the knowledge base. This
is a good way to ensure that all the schemata are meaningful and hence that
the results obtained by such a method are valid. So far, we have not worked
on the automatic extraction of schemata, which could be done, for example, by
searching for a common generalization of the graphs of the knowledge base.
Moreover, we cannot use the schemata stored in the relational database as in
[6] because (i) the information stored in both databases is different and (ii) we
do not want to impose the preexistence of a unique schema for the semi-structured
information.

Definition 8 Let V be a view, and let $a_1, \dots, a_m$ be the m attributes belonging to
the DB-view (in both the select and the where clauses). We will say that
$S(V) = \{S_1, \dots, S_r\}$ is the set of CG schemata associated with the view V if each CG
$S_k$ of $S(V)$:

— is an acyclic and connected CG;

— contains exactly one generic concept vertex of the concept type (or a subtype)
corresponding to the attribute $a_i$ in the concept type set for each attribute $a_i$
belonging to the view;

— links the attributes of the view with meaning in the considered view (this last
point obviously depends on a human expert).

Definition 9 In a schema graph $S_k$, the unique concept vertex corresponding to
the attribute $a_i$ of the view is noted vertex($a_i$). Its concept type is $a_i$ or one of
its subtypes (as we saw previously, each attribute belonging to a view has to be
inserted in the concept type set).

In this first version of our work, we deliberately limit the schemata to
connected and acyclic CGs: connected because it seems natural that all the
attributes belonging to the same view should be linked, and acyclic for complexity
reasons. As we will see in the next section, our searching algorithm is based on the
projection operation, which is NP-complete, but polynomial in some particular
cases such as that of a tree projected into a graph [12].
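
The structural part of Definition 8 (a schema must be connected and acyclic, i.e. a tree once concept and relation vertices are both treated as nodes) can be checked mechanically. The following is a minimal sketch; the node names and the relation vertices `cond1` and `cond2` are hypothetical, and the semantic condition of Definition 8 still depends on a human expert.

```python
def is_tree(nodes, edges):
    """True if the undirected graph (nodes, edges) is connected and acyclic."""
    if not nodes or len(edges) != len(nodes) - 1:
        return False
    adjacency = {n: [] for n in nodes}
    for u, v in edges:
        adjacency[u].append(v)
        adjacency[v].append(u)
    seen, stack = set(), [nodes[0]]
    while stack:                      # depth-first traversal from one node
        n = stack.pop()
        if n not in seen:
            seen.add(n)
            stack.extend(adjacency[n])
    # A connected graph with |nodes| - 1 edges has no cycle.
    return len(seen) == len(nodes)

# Hypothetical schema: [Experience] linked to [Temperature] and [pH].
nodes = ["Experience", "cond1", "Temperature", "cond2", "pH"]
edges = [("Experience", "cond1"), ("cond1", "Temperature"),
         ("Experience", "cond2"), ("cond2", "pH")]
print(is_tree(nodes, edges))
```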

For example, we define the view Bacteriocin Interaction with the following at- 
tributes: Substrate, Bacteriocin, Duration, Temperature, Pathogen Germ, Expe. 
Result. The schema presented in fig. 4 is associated with this view. 




Fig. 4. A schema associated with the view Bacteriocin Interaction (concept vertices 
belonging to the view are framed in bold) 



6 Querying the Knowledge Base 

Let $Q = \{V, a_1, \dots, a_n, \langle a_{n+1}, v_{n+1}\rangle, \dots, \langle a_m, v_m\rangle\}$ be a query, and
$S(V) = \{S_1, \dots, S_r\}$ be the set of schema graphs associated with view V. The first version
of the querying algorithm is the following:

    answer ← ∅
    for each S ∈ S(V) do
        query ← S
        for each vertex(a_i) ∈ {vertex(a_{n+1}), ..., vertex(a_m)} do
            restrict the label of the generic concept [a_i : *] of query to
                sel-vertex(<a_i, v_i>)
        done
        project query graph into each graph of KB
        for each projection Π do
            build the tuple {<a_1, v_1>, ..., <a_n, v_n>} from the result of the
                DB-projection of the vertices vertex(a_1), ..., vertex(a_n)
            add this tuple to answer
        done
    done

Since the query graphs are acyclic, the projection algorithm we use is the 
polynomial algorithm for the projection of a tree into a graph proposed in [12]. 
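
The querying algorithm can be sketched in Python for a single schema graph. This is an illustration under stated assumptions, not the implementation described in the paper: graphs are dictionaries with hypothetical keys `concepts`, `relations` and `vertex`, relation types are compared by equality, the projection search is a brute-force enumeration rather than the polynomial tree algorithm of [12], and the DB-projection value is simplified to the referent (or the type for a generic vertex).

```python
from itertools import product

PARENT = {"Nisin": "Bacteriocin"}  # hypothetical concept type hierarchy

def is_subtype(t, a):
    while t is not None:
        if t == a:
            return True
        t = PARENT.get(t)
    return False

def projections(query, target):
    """Yield every mapping of query concept ids to target concept ids that
    specializes labels and preserves relations (brute force)."""
    q_ids = list(query["concepts"])
    t_rels = set(target["relations"])
    options = []
    for q in q_ids:
        q_type, q_ref = query["concepts"][q]
        options.append([t for t, (t_type, t_ref) in target["concepts"].items()
                        if is_subtype(t_type, q_type) and q_ref in ("*", t_ref)])
    for combo in product(*options):
        m = dict(zip(q_ids, combo))
        if all((r, tuple(m[c] for c in args)) in t_rels
               for r, args in query["relations"]):
            yield m

def answer_query(schema, proj_attrs, sel_criteria, kb):
    """schema['vertex'] maps each attribute a_i to the id of vertex(a_i)."""
    query = {"concepts": dict(schema["concepts"]), "relations": schema["relations"]}
    for a, v in sel_criteria:   # restrict [a : *] to sel-vertex(<a, v>)
        query["concepts"][schema["vertex"][a]] = (v, "*") if is_subtype(v, a) else (a, v)
    answer = []
    for g in kb:
        for m in projections(query, g):
            row = []
            for a in proj_attrs:
                t_type, t_ref = g["concepts"][m[schema["vertex"][a]]]
                row.append((a, t_type if t_ref == "*" else t_ref))
            answer.append(tuple(row))
    return answer

# Hypothetical schema and knowledge base graph (relation names are invented).
schema = {"concepts": {"e": ("Experience", "*"), "t": ("Temperature", "*"),
                       "b": ("Bacteriocin", "*"), "s": ("Substrate", "*")},
          "relations": [("cond", ("e", "t")), ("agent", ("e", "b")), ("obj", ("e", "s"))],
          "vertex": {"Temperature": "t", "Bacteriocin": "b", "Substrate": "s"}}
g1 = {"concepts": {1: ("Experience", "e1"), 2: ("Temperature", 37), 3: ("Nisin", "*"),
                   4: ("Substrate", "Skimmed milk"), 5: ("Temperature", 20)},
      "relations": [("cond", (1, 2)), ("agent", (1, 3)), ("obj", (1, 4))]}
print(answer_query(schema, ["Substrate"],
                   [("Bacteriocin", "Nisin"), ("Temperature", 37)], [g1]))
```

Because the schema links every attribute to the same [Experience : *] vertex, the unrelated temperature vertex (id 5) can never take part in a projection.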
For example, we consider the following query: 

Q = {View = BacteriocinInteraction, Substrate, Bacteriocin, Duration,
Temperature, PathogenGerm, ExperimentalResult, <Bacteriocin, Nisin>,
<Duration, 2>, <Temperature, 37>, <PathogenGerm, Listeria>}

The query graph projected in the KB is given in fig. 5. The tuples returned 
by the algorithm (after projection on both examples shown in fig. 2 and 3) are 
the following: 



Substrate          Bacteriocin  Duration  Temperature  Pathogen Germ     Experimental Result
-----------------  -----------  --------  -----------  ----------------  -------------------
Skimmed milk       Nisin        2         37           Listeria Scott A  Reduction
Half-skimmed milk  Nisin        2         37           Listeria Scott A  Stability




Fig. 5. An example of query graph projected in the knowledge base

This first version allows us to build an answer to the query as a virtual table
based on the projection of partially instantiated schemata into the graphs of the
knowledge base. But with this version, we obviously limit the result to an exact
projection of the different schema graphs. Since the structure of the graphs of
S(V) and the structure of the graphs of KB are not necessarily the same, it is
possible to get “silence” in the answer (the information allowing an answer is
present in KB but the system cannot find it). For example, if the base contains
the information stored in the CG shown in fig. 6 and if the query is that given
in fig. 5, the projection is impossible.




Fig. 6. A third example of semi-structured information (experience 3, only subpart 
different from experience 1 is shown) 



This is a well-known problem with the search for projections of a query into 
a knowledge base, which is a Boolean operation (a CG can be projected into 
another one, or cannot). To solve this problem, it is possible to use a specific 
projection operation (such as α-β projection in [13] or “partial projection” in
[14]). Another possibility is the generalization of the query graph as in [15]. 

In order to reduce the risk of “silence” in the answer, we propose to generalize 
the query graph by splitting it and by removing some parts of the graph, when 
no answer is found. 

Generalization of a query graph. In the following, we describe intuitively the
main steps of the generalization process of a query graph query. During the
first step, if the DB-view contains n attributes, we build at most n generalized
query graphs for each schema graph. For example, if the n attributes of the
view are $a_1, a_2, \dots, a_n$, we build n query graphs $query_1, query_2, \dots, query_n$ from
the query graph query. The graph $query_i$ is a disconnected graph built by removing the
vertices located on branches starting with vertex($a_i$) until the last relation vertex
incident to shortest-path($a_i$) or until the end of the branch if the branch is not
incident to shortest-path($a_i$). We define shortest-path($a_i$) as the shortest path
between all the attributes of query excluding $a_i$. The vertex vertex($a_i$) is not removed: the
graph becomes non-connected. Note that if vertex($a_i$) is located on shortest-path($a_i$),
$query_i$ is not built (since it is isomorphic to query).

During the second step, all the generalized queries $query_1, query_2, \dots, query_n$
are generalized again, following the same process, and so on, until the generalized
query results in a disconnected graph composed of single generic concept vertices.
This final generalization corresponds to the naive approach presented in section
5. Of course, the higher the generalization of the query, the lower the level of
confidence.

Note that $2^n$ is the upper bound of the number of generalized graphs if the
view contains n distinct attributes. This is of course a major drawback of the
mechanism used to generalize queries, but in actual practice, we reduce the
combinatorial explosion by limiting the generalization process to two disconnected
vertices, so at most $n(n-1)/2$ generalized queries are generated. Moreover, it is
possible to store the generalizations of all the schema graphs, in order to avoid
computing these generalizations every time.
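
As a quick arithmetic check of these two bounds, the short sketch below evaluates them for a view with six attributes, the size of the Bacteriocin Interaction view. The function name is hypothetical, and $n(n-1)/2$ is computed as the number of ways to choose two of the n attributes.

```python
from math import comb

def generalization_bounds(n):
    """Return (2**n, n*(n-1)/2) for a view with n distinct attributes."""
    return 2 ** n, comb(n, 2)

print(generalization_bounds(6))  # (64, 15)
```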

In order to take into account the confidence level in the answers, we introduce 
an extension of the notion of answer. 

Definition 10 An answer A to a query Q is a set of tuples, each of them of
the form $\{\langle a_1, v_1, b_1\rangle, \dots, \langle a_n, v_n, b_n\rangle\}$ with $a_1, \dots, a_n$ the attributes of the
DB-projection, $v_1, \dots, v_n$ the values resulting from the execution of the query
associated with each attribute of the DB-projection and $b_1, \dots, b_n$ Boolean values,
$b_i$ being set to false if vertex($a_i$) is the image of an isolated vertex of one of the
generalized query graphs.

For example, if we consider the query given in fig. 5, the generalized query 
graph obtained when the vertex Substrate has been isolated is given in fig. 7. 
This generalized graph can be projected into the CG of fig. 6. The final result
delivered by this method for the query of fig. 5 is given in the following table:



Substrate          Bacteriocin  Duration  Temp.  Pathogen Germ     Experimental Result
-----------------  -----------  --------  -----  ----------------  -------------------
Skimmed milk       Nisin        2         37     Listeria Scott A  Reduction
Half-skimmed milk  Nisin        2         37     Listeria Scott A  Stability
**Cabbage**        Nisin        2         37     Listeria Scott A  Reduction



Remark 4 The bold font used for the value cabbage means that this value has 
been obtained by the projection of an isolated vertex. The reliability of this value 
is not ensured. 
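
The extended answer of Definition 10 amounts to attaching a Boolean flag to each value. A minimal sketch, assuming a tuple is a list of `(attribute, value)` pairs and that the set of attributes whose vertex was isolated during generalization is known (both encodings are assumptions made for illustration):

```python
def flag_values(values, isolated):
    """Build <a_i, v_i, b_i> triples; b_i is False when vertex(a_i) was
    isolated in the generalized query graph that produced the tuple."""
    return [(a, v, a not in isolated) for a, v in values]

row = flag_values([("Substrate", "Cabbage"), ("Bacteriocin", "Nisin"),
                   ("Temperature", 37)], isolated={"Substrate"})
unreliable = [a for a, _, reliable in row if not reliable]
print(row)
print(unreliable)  # ['Substrate']
```

A user interface could use the flag to render unreliable values differently, as the table above does for Cabbage.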



7 Conclusion and Perspectives 

In this paper, we presented the very first step of an important
project concerning the storage and retrieval of semi-structured information. The
underlying application concerns the domain of microbial risks in food products.
In this first paper, we focused on the unified interrogation of a relational database
and a knowledge base represented in terms of CGs. We implemented the part of
this work concerning the CGs as a prototype built on the CoGITo platform [16].
Our next work will focus on three different points:





Towards a Unified Querying System of Imprecise Data Using Fuzzy View 219 




Fig. 7. The generalized graph from the query graph of fig. 5 obtained after isolation
of the vertex Substrate.



— the extension of the CG model we use: due to the nature of the information 
we have to represent, it is important to extend the referent we use to numeric 
values, sets, fuzzy sets... Previous works have already studied the extension 
of the CG model to represent fuzzy information [17,18]; 

— the extension of the querying language of both the database and the knowl- 
edge base; 

— the testing of our prototype on an entire knowledge base, which has to be cre- 
ated by the group of microbiologist experts working on our national project. 

In the more distant future, we will have to consider the use of our system
by non-specialists of the CG model. Substantial work remains to be done on the
interface of our system. First, during the knowledge acquisition stage:
for example, it is important to enable biologists to build the knowledge base
with CGs which have to be completed. Second, during the querying stage, by
displaying additional information extracted from CGs when the answer is built
by using information found in the knowledge base.

Abstract. This paper gives the extensional semantics definitions associated
with simple conceptual graphs^1 (CG) under a simple paradigm closely related
to relational database (RDB) theory. We believe that by providing such
definitions, the implementation of a CG system using this paradigm would be
greatly facilitated. Considering the nature of CGs, existentially quantified
assertions, the use of a RDB at the implementation level becomes attractive for
applications with huge volumes of data (as with deductive databases).
Furthermore, RDB theory provides years of experience, stability and on-going
support of the many tools that the industry offers today. These advantages
become prerequisites when choosing a technological platform for a large scale
industrial project. This paper provides a formal basis for understanding the
deep semantics of the CG notation, including the compositional semantics of
the canonical formation operators. It also provides a simple procedure that
performs model checking of graphs in a CG system.



1 Introduction 

A conceptual graph is mainly composed of existentially quantified concepts 
representing objects of some type. When translating a conceptual graph into a clause 
form as needed by a theorem prover, the skolemization of these variables introduces 
constants that represent these objects. This step, needed by a theorem prover whose 
unification procedure is based on universally quantified variables, prevents any 
reasoning on the possible cross-referencing of these variables. For applications where 
these cross-references must be established, for instance when information is exchanged 
between different knowledge bases or agent systems, the underlying semantics of these 
systems should not use skolemization to introduce constants. Rather, we advocate the 
use of surrogates (also called witnesses or place-holders in different theories). A 
surrogate is a variable used as a constant by the unification procedure, but which 
represents some individual that may not be known at the time the inference engine 

^1 Simple conceptual graphs are those that do not bear nested graphs and do not use the
negation operator.

requires it. However, knowing that the individual object exists may be sufficient to
provide interpretations over the formulae that use it, independently of its actual value 
(which can be determined at a later stage). Consequently, it provides for more 
flexibility when knowledge acquisition is done incrementally, as is the case in practical 
applications; the truth- value of a formula can be computed in spite of this lack of 
information. 

In the CG literature, as surprising as it may seem, there are only a few
attempts to define the extensional semantics of the CG formalism. Among them, let us
cite the two works most relevant to this paper [1,2]. In [1], a brief correspondence
between what is normally found in first order logic theory and the CG theory is
sketched. Based on this correspondence, this paper provides the full definitions
required to define the extensional semantics of the CG theory. In order to do so,
necessary extensions proposed in this paper include: 1) a redefinition of the δ operator
which now defines the extension set of a conceptual graph and is not limited to a type,
2) the introduction of variables into the definitions so that interpretations of formulae
(conceptual graphs) could be defined over sets of existentially quantified concepts, and
3) the definition of the compositional semantics of the canonical formation operators.

In [2], the author presents the most thorough description of the CG
extensional semantics in the existing literature. She proposes a general framework that
links a grammar to its interpretation onto a set of individual objects. This is certainly a
very general theory applicable in many cases, including the conceptual graph theory.
What we propose in this paper could therefore be viewed as an instantiation of what is
proposed in [2] for a special case where the grammar is known and described in [3], and
where the compositional semantics of the formation operators is given according to a
RDB paradigm. Again, the choice of this paradigm as a basis for providing these
definitions aims at facilitating the mapping between theory and tools, which reduces the
time-to-market constraint imposed on real-life industrial problems, and meets the usual
requirements of software development in industrial settings: available expertise,
stability of tools, support during development, installation and maintenance cycles,
sustainability, efficiency, etc.

Section 2 provides the definitions of a canon of a CG system.^2 Section 3
describes the extensional semantics of the notation, including the compositional 
semantics of the basic canonical formation operators. Section 4 discusses how this 
framework can be extended to include semantic constraints. Section 5 concludes. 



2 The Canon of a CG System 

All graphs in a CG-based system are derived from the canon of the system. A canon 
contains the ontological elements that are used to describe knowledge in this domain. 
A canon is described as a tuple <T,I,::,B>. T is a set of concept and relation types. I is
a set of referents representing individual objects, either directly identifiable (by 



^2 We chose to make this paper self-contained, and therefore, to include definitions that are
well-known to CGers [3].

constants) or not (then identifiable by variables). The conformity relation :: encodes
whether a particular individual, represented by a referent $i \in I$, can be interpreted as
conformant to a particular type $t \in T$, written t::i. B is called the canonical basis of the
CG system; it encodes the signature of every relation type. The signature of a relation
type t of arity j is a tuple of j concept types, each one indicating the concept type that
the i-th component of a relation of type t must conform to (for $i \in [1,j]$).

This section presents the formal definitions associated with each of the four 
basic elements of the description of a canon <T,I,::,B> with respect to a fixed set of 
objects O called the domain of the system. Some definitions come from [3]; others 
were added or adapted in order to complete the definition of the extensional semantics 
of the CG theory. 



2.1 The Set of Types T 

Here we present a single set of types which encompasses concept types and relation 
types (of different arity). We make this choice in order to simplify information 
interchange between systems wishing to establish a common ontology in order to 
communicate. Together, the elements of T form a partial order $(T, \leq)$ with one single
top element $\top$, the universal type (the most general type), and a single bottom element
$\bot$, the absurd type (the most specific type). Concept types and relation types are
pairwise incomparable; relation types of different arity are pairwise incomparable. 
Formally, we introduce the set of types as follows. 

Definition 1. Let $(T_c, \leq_c)$ be a partially ordered set of concept types, and $(T_r, \leq_r)$
be a partially ordered set of relation types with arity function
$\mathrm{arity}: T_r \setminus \{\top, \bot\} \to \mathbb{N}^*$, such that:

1.1) $T_c \cap T_r = \{\top, \bot\}$;

1.2) $\bot \leq_c \top$ and $\bot \leq_r \top$;

1.3) for all $t \in T_c \setminus \{\top, \bot\}$: $\bot \leq_c t \leq_c \top$, and for all $t \in T_r \setminus \{\top, \bot\}$: $\bot \leq_r t \leq_r \top$;

1.4) for all $t_1, t_2 \in T_r \setminus \{\top, \bot\}$: $\mathrm{arity}(t_1) \neq \mathrm{arity}(t_2)$ implies $t_1$ and $t_2$ are
incomparable, i.e., neither $t_1 \leq_r t_2$ nor $t_2 \leq_r t_1$ holds.

The set of types is defined by $T = T_c \cup T_r$, and the partial ordering
$\leq$ on T is defined as the union of $\leq_c$ and $\leq_r$.
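
The conditions of Definition 1 can be checked on a small example. The sketch below is an illustration only: the type names, the `SUPERS` dictionary (each type mapped to its direct supertypes) and the `ARITY` table are all hypothetical, and the ordering is computed by walking up the hierarchy.

```python
TOP, BOTTOM = "TOP", "BOTTOM"

# Hypothetical hierarchy: each type maps to its direct supertypes.
SUPERS = {"Person": [TOP], "Employee": ["Person"],
          "rel2": [TOP], "agent": ["rel2"],
          "rel3": [TOP], "between": ["rel3"]}
ARITY = {"rel2": 2, "agent": 2, "rel3": 3, "between": 3}  # relation types only

def leq(t1, t2):
    """Partial order on T: BOTTOM below everything, TOP above everything."""
    if t1 == t2 or t1 == BOTTOM or t2 == TOP:
        return True
    return any(leq(s, t2) for s in SUPERS.get(t1, []))

def arity_condition_holds():
    """Condition 1.4: comparable relation types must have the same arity."""
    return all(ARITY[a] == ARITY[b] for a in ARITY for b in ARITY if leq(a, b))

print(leq("Employee", "Person"))   # True
print(leq("agent", "between"))     # False: different arities, incomparable
print(arity_condition_holds())     # True
```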



2.2 The Set of Referents I 

Referents represent objects of the reality being modeled, or artifacts found in computer 
models needed to describe objects of the reality. A referent is an individual marker (a 
constant value) when the object that it represents is known (identifiable); or it is a 
variable when the object that it represents is known to exist but has not yet been 
identified. Let M be the set of individual markers; let X be the set of variables. 

In the traditional CG literature, a star symbol is often used in the referent field 
to denote a concept that represents an existing but unknown (unidentified) object. 
Alternatively, in that case, the whole referent field may be omitted altogether. Such a
concept is called a generic concept and is existentially quantified using some variable
representing the object. This mechanism should be seen as a shortcut notation provided 
to the knowledge modeler only for ease of expression. In fact, even when not directly 
inputted by the knowledge modeler, the existentially quantified variable does exist; it 
identifies the object that the generic concept represents. 

Finally, in order to represent the whole set of individuals complying with the
concept type of some concept, the universal quantifier $\forall$ may be used as a referent.
The elements of $(M \cup X)$ are called existential referents; the $\forall$ symbol is called the
universal referent.

Definition 2. Let M be a set of individual markers with $\forall \notin M$, and let X be a
set of variables with $\forall \notin X$, such that $M \cap X = \emptyset$. Then the set of referents I
is defined by $I = M \cup X \cup \{\forall\}$.

However, the universal referent is used as a short-cut notation to represent the
distributive set of all individuals conformant to a certain type. For instance, we could
use the concept [EMPLOYEE: $\forall$] to represent all the employees that we know of, i.e.,
[EMPLOYEE: #John], [EMPLOYEE: #Joe] and [EMPLOYEE: #Peter], which could
also be represented using a typed distributive set: [EMPLOYEE: Dist(#John, #Joe,
#Peter)] as defined in [3]. Consequently, we do not need and will not include the
universal referent in the definitions below.

The relationship between the referents in I and the domain O of the CG system
is given by an allocation function $\alpha$, i.e., a mapping $\alpha: I \setminus \{\forall\} \to O$ that maps all
existential referents onto objects in O. In this paper, we assume $\alpha$ to be fixed.

Assumption 1. (Unique Name Assumption) For all individual markers $i_1, i_2 \in M$,
$i_1 \neq i_2$ implies $\alpha(i_1) \neq \alpha(i_2)$, i.e., different individual markers always
represent different objects.

Assumption 2. (Global Coreference Assumption) Since all variables appear in
I and are hence accessible to any embedded context, and since $\alpha$ is defined as
a function, each variable is seen as a global referent to the object that it
represents (i.e., accessible in any context).

Due to the unique name assumption, the set of individual markers M can be seen as
a subset of O in the following sense: the set $\alpha(M) = \{\alpha(m) \mid m \in M\}$ is a subset of O
and there exists a bijection from M onto $\alpha(M)$. In other words, we might use the object
o with $\alpha(m) = o$ as referent instead of the individual marker $m \in M$. Note, however, that
the unique name assumption does not apply to the set of variables X, i.e., $\alpha$ need not be
injective on X. Also, in environments where $\alpha(x)$ is known to exist but cannot be
resolved to a particular (identifiable) element of O (e.g., in cases of partial knowledge),
X cannot be seen as a subset of O analogously to M, and a referent $x \in X$ may not be
resolved by its corresponding object $\alpha(x) \in O$.^3
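
The Unique Name Assumption states that the allocation function is injective on the individual markers, while variables may share an object. A minimal sketch, assuming the allocation function is given as a dictionary (the referents and object names below are hypothetical):

```python
def satisfies_unique_names(alpha, markers):
    """Assumption 1: distinct individual markers denote distinct objects."""
    images = [alpha[m] for m in markers]
    return len(images) == len(set(images))

# Hypothetical allocation: markers start with '#', x and y are variables.
alpha = {"#John": "o1", "#Joe": "o2", "x": "o1", "y": "o1"}
print(satisfies_unique_names(alpha, ["#John", "#Joe"]))  # True
print(satisfies_unique_names(alpha, ["x", "y"]))         # False: allowed for variables
```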

^3 Referent resolution may not be needed in order for different agents to communicate if the
information about the existence of some object is all that is required for the communication

2.3 The Canonical Basis B

The canonical basis of a CG system contains basic graphs, called canonical graphs, 
from which any other graph is derived through the application of the canonical 
formation operators (see [3] for an introduction on each of these operators). The 
canonical basis B contains one canonical graph for each relation type in $T_r$; this graph
is a model of how any relation of that type must be used. In other words, it constrains 
(restricts) the concept types of the concepts that such a relation may link. 

Canonical graphs are sometimes called signature graphs. Through the 
projection operator, the CG system makes sure that any relation complies with its 
signature graph. That is, there must always be a projection of the signature graph of a 
relation type onto a relation of that type in any conceptual graph that uses it. By the 
definition of the projection operation, each concept linked to the relation must be a 
specialization of the corresponding concept type appearing in its signature graph. 
Therefore, the signature graph of a relation type encodes a constraint on the use of 
relations of that type. 

Definition 3. Let s denote the signature function that maps each relation type
$t \in T_r \setminus \{\top, \bot\}$ onto its signature $s(t) \in (T_c \setminus \{\bot\})^{\mathrm{arity}(t)}$. Let comp(i,t) denote the i-th
component of the signature of the relation type t, i.e., $\mathrm{comp}(i,t) = t_i$ if $i \leq \mathrm{arity}(t)$
and $s(t) = (t_1, \dots, t_i, \dots, t_{\mathrm{arity}(t)})$. For $i > \mathrm{arity}(t)$, we define $\mathrm{comp}(i,t) = \bot$,
meaning that comp(i,t) is not defined for any i greater than arity(t). The
signature function s is conformed to the set of relational types $T_r$ iff for all
$t_1, t_2 \in T_r \setminus \{\top, \bot\}$, if $t_1 \leq t_2$, then for all $1 \leq i \leq \mathrm{arity}(t_1)$, $\mathrm{comp}(i,t_1) \leq \mathrm{comp}(i,t_2)$.^4 For
a signature function s that is conformed to the set of relational types $T_r$, the
canonical basis B is defined by $B = \{\langle t, s(t)\rangle \mid t \in T_r \setminus \{\top, \bot\}\}$.

In order to proceed with the formalization, we first recall the formal definition of a
simple conceptual graph. A simple conceptual graph over a set of types T is a labeled
bipartite graph of the form $u = (C, R, E, l)$, where C is the set of concept nodes and R
is the set of relation nodes, $E \subseteq C \times R$ is the edge relation, and l is the labeling of u:
each concept node $c \in C$ is labeled with a tuple $l(c) = (\mathrm{type}(c), \mathrm{ref}(c)) \in T_c \times I$, called
the type and the referent of c (where type and ref are two functions such that
$\mathrm{type}: C \to T_c$ and $\mathrm{ref}: C \to I$); each relation node $r \in R$ is labeled with a relation type
$\mathrm{type}(r) \in T_r$; finally, all edges that are incident to the same relation node $r \in R$ are labeled
with an integer number in $[1,n]$, where n is the arity of r, in such a way that the set of all
labels incident to the same relation r is $\{i \mid i \in [1,n]\}$. By definition, all edges whose
label is smaller than n are called in-coming edges (pointed toward the relation), and the
edge whose label is equal to n is called the out-going edge (pointed away from the
relation). A concept node c is called the j-th neighbor of a relation node r of type t iff c is
linked to r by an edge such that $j = l(c,r)$. The referent of r in u is denoted by
$\langle i_1, i_2, \dots, i_{\mathrm{arity}(t)}\rangle$, where $i_j$ is the referent of the j-th neighbor of r in u. The following

to proceed. In that case, referent resolution can be postponed to a later stage. This improves 
interoperability between different knowledge-based systems. 

^4 Note that $t_1 \leq t_2$ implies $\mathrm{arity}(t_1) = \mathrm{arity}(t_2)$ (see Definition 1).

assumption formalizes the idea that each simple CG that is said to be derivable from a
canon <T, I, ::, B> should conform to the signature given by B.

Assumption 3. (Signature Compliance Assumption) Let u be a simple CG.
For all r in u and for t = type(r), $1 \leq i \leq \mathrm{arity}(t)$, $\mathrm{type}(c) \leq \mathrm{comp}(i,t)$, where c is
the i-th neighbor of r in u.
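
Definition 3 and Assumption 3 can be illustrated together. In the sketch below, everything named is an assumption made for the example: the type names, the `SUPERS` and `SIG` tables, and the encoding of a simple CG as a dictionary of concept labels plus a list of relations with their neighbors in edge-label order.

```python
BOTTOM = "BOTTOM"
SUPERS = {"Employee": ["Person"], "worksFor": ["linkedTo"]}   # hypothetical hierarchy
SIG = {"linkedTo": ("Person", "Company"),                     # hypothetical signatures s(t)
       "worksFor": ("Employee", "Company")}

def leq(t1, t2):
    return t1 == t2 or t1 == BOTTOM or any(leq(s, t2) for s in SUPERS.get(t1, []))

def comp(i, t):
    """i-th component of the signature of t, BOTTOM beyond the arity."""
    sig = SIG[t]
    return sig[i - 1] if 1 <= i <= len(sig) else BOTTOM

def is_conformed():
    """Definition 3: signatures specialize along the relation type order."""
    return all(leq(comp(i, t1), comp(i, t2))
               for t1 in SIG for t2 in SIG if leq(t1, t2)
               for i in range(1, len(SIG[t1]) + 1))

def complies(concepts, relations):
    """Assumption 3: the i-th neighbor's type must be below comp(i, t)."""
    return all(leq(concepts[c][0], comp(i, t))
               for t, args in relations
               for i, c in enumerate(args, start=1))

concepts = {1: ("Person", "#Joe"), 2: ("Company", "x")}
print(is_conformed())                               # True
print(complies(concepts, [("linkedTo", (1, 2))]))   # True
print(complies(concepts, [("worksFor", (1, 2))]))   # False: Person is not below Employee
```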



2.4 The Conformity Relation :: 

The conformity relation establishes the basis for doing model checking on simple CGs
over a set of types T. Recall that we consider a fixed domain O and a fixed allocation
function $\alpha$. In what follows, we also assume that there exists a fixed interpretation $\beta$ of
the types in T with regard to O, i.e., each concept type $t \in T_c$ is interpreted by a subset
$\beta(t)$ of O, and each relation type $t \in T_r$ with arity(t) = n is interpreted by an n-ary
relation over O, $\beta(t) \subseteq O^n$. Based on this fixed interpretation $\beta$, we can now define the
extensional semantics of CGs over T. In this section, we first introduce the extensional
semantics of basic CGs; in Section 3, we use these definitions to present the semantics
of canonically derived CGs obtained from basic CGs by the iterative application of the
formation operators. The conformity relation :: of the canon corresponds to the
characteristic function of the interpretation $\beta$, and we get Definition 4, and
consequently, Assumption 4.

Definition 4. The conformity relation :: maps types and (tuples of) referents
onto {True, False} according to the interpretation $\beta$. For all concept types $t \in T_C$
and all referents $i \in I \setminus \{\forall\}$, $t::i$ = True iff $a(i) \in \beta(t)$. For all relation types $t \in T_R$
with arity(t) = n and all tuples $(i_1, \ldots, i_n) \in (I \setminus \{\forall\})^n$, $t::(i_1, \ldots, i_n)$ = True iff
$(a(i_1), \ldots, a(i_n)) \in \beta(t)$. There is no $i \in I$ (tuple $(i_1, \ldots, i_n) \in I^n$) such that $\bot::i$
($\bot::(i_1, \ldots, i_n)$) holds. For all concept types $t \in T_C$, $t::\forall$ = True iff $\beta(t) \ne \emptyset$.⁵
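
The conformity relation of Definition 4 can be read as a membership test. The sketch below is one possible rendering under stated assumptions: the allocation table `A`, the interpretation `BETA`, the type names and the string used for the universal referent are all hypothetical.

```python
# Hypothetical allocation a (referent -> object of O) and interpretation beta.
FORALL = "forall"                 # stand-in for the universal referent
A = {"#tom": "o1", "#sue": "o2", "#p1": "o3"}
BETA = {
    "PERSON": {"o1", "o2"},       # concept types: subsets of O
    "UNICORN": set(),
    "works_on": {("o1", "o3")},   # relation types: sets of tuples over O
}

def conforms(t, referents):
    """t::i for a single referent, or t::(i1, ..., in) for a tuple."""
    if referents == FORALL:       # t::forall holds iff beta(t) is non-empty
        return bool(BETA[t])
    if isinstance(referents, tuple):
        return tuple(A[i] for i in referents) in BETA[t]
    return A[referents] in BETA[t]
```

The absurd type is simply absent from `BETA` here, which matches the requirement that nothing conforms to it only if callers never look it up; a fuller version would return False for it explicitly.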

Assumption 4. (Type Subsumption Assumption) For all concept types $t_1 \le t_2$,
and for all referents $i \in I$, $t_1::i$ implies $t_2::i$. For all relation types $t_1$ and $t_2$ with
$t_1 \le t_2$ and $\mathrm{arity}(t_1) = n$, and for all tuples $(i_1, \ldots, i_n) \in I^n$, $t_1::(i_1, \ldots, i_n)$
implies $t_2::(i_1, \ldots, i_n)$.



3 The Extensional Semantics of Derived Conceptual Graphs 

The purpose of this section is to introduce the compositional semantics of the
canonical formation operators so that simple model checking procedures can be
developed according to the specificity of the CG-based model. Of course we could
choose to use standard methods on FOL formulae produced by the application of the $\phi$
operator on the graph for which model checking is required. However, since
conceptual graphs are mainly existentially quantified assertions, model checking can
be achieved according to the semantics of the formation operators themselves, yielding
a much more direct and simpler procedure totally adapted to the nature of CG systems.
Close to relational database theory, a model checking procedure adapted to the nature
of the CG operators would facilitate rapid prototyping and subsequently easy
implementation of CG-based systems. Having formally defined the canon of a
CG system with regard to a given domain O, an allocation function a, and an
interpretation function $\beta$, we can now define the extensional semantics of CGs derived
from the canon. A derived CG is obtained by the iterative application of the canonical
formation operators [3] on basic conceptual graphs (i.e., graphs having a single
concept node or a single relation node together with its connected concept nodes).

⁵ Once again the $\forall$ symbol is used only as a shortcut notation to represent the denotation set of
type t. Hence the interpretation of concept [t:$\forall$] may be false if this set is empty.



3.1 The Interpretation of Basic CGs 

We first recall the definition of the $\phi$ operator and introduce the notion of support of a
conceptual graph. Intuitively, the support $\delta(u)$ of a CG u denotes the set of all tuples of
referents allocated by a in such a way that u evaluates to True. We can use the support
to characterize the CGs that are valid with respect to the given interpretation.

Definition 5. Formally, we define value(u) := ($\delta(u) \ne \emptyset$), where the support
set of graph u is defined below.

Definition 6. Let u be a single concept graph, i.e., a basic CG of the form [t:i]
with $t \in T_C$ and $i \in I$. Then t is used as a unary predicate and we define:

6.1) if $i \in M$, then $\phi(u) := t(i)$, where i is used as a constant; and $\delta(u) := \{\langle i \rangle\}$
if $a(i) \in \beta(t)$; otherwise, $\delta(u) := \emptyset$;

6.2) if $i \in X$, then $\phi(u) := \exists i\, t(i)$, where i is used as an existentially quantified
variable; and $\delta(u) := \{\langle j \rangle \mid j \in I \setminus \{\forall\} \text{ and } a(j) \in \beta(t)\}$;

6.3) if $i = \forall$, $\phi(u) := \bigwedge_{j:\, a(j) \in \beta(t)} \phi([t:j])$.⁶

And in each case, value(u) can be computed based on Definition 5.

Note that for a single concept graph u = [t:i] the support $\delta(u)$ coincides with the
denotation set of t as introduced in [3].
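
To make cases 6.1 and 6.2 concrete, here is a small sketch of the support of a single concept graph together with the value function of Definition 5. The allocation `A`, the interpretation `BETA` and the convention that any referent outside `MARKERS` is a variable are assumptions made for the example only.

```python
# Hypothetical allocation and interpretation, for illustration only.
A = {"#tom": "o1", "#sue": "o2", "#rex": "o3"}      # allocation a: I -> O
BETA = {"PERSON": {"o1", "o2"}, "UNICORN": set()}   # beta(t), a subset of O
MARKERS = set(A)                                    # M: individual markers

def support(t, i):
    """Support of the single concept graph [t:i]."""
    if i in MARKERS:                                # case 6.1: i is a constant
        return {(i,)} if A[i] in BETA[t] else set()
    # case 6.2: any other referent is treated here as a variable
    return {(j,) for j in MARKERS if A[j] in BETA[t]}

def value(delta):
    """Definition 5: a graph is true iff its support is non-empty."""
    return len(delta) > 0
```

With these tables, [PERSON:#rex] has an empty support and evaluates to False, while [PERSON:*x] is supported by the two conforming markers.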

Definition 7. Let u be a basic CG composed of a single conceptual relation r
of type $t \in T_R \setminus \{\top, \bot\}$, linked to arity(t) concepts according to s(t), its signature
graph. For $j \in [1, \mathrm{arity}(t)]$, let concept(j,r,u) denote the j-th concept linked to r in
u. Then t is used as an arity(t)-ary predicate and we define:
$\phi(u) := \bigwedge_{j \in [1, \mathrm{arity}(t)]} \phi(\mathrm{concept}(j,r,u)) \wedge t(i_1, i_2, \ldots, i_{\mathrm{arity}(t)})$,
where $i_j$ denotes the referent of concept(j,r,u); and
$\delta(u) := \{\langle i_1, i_2, \ldots, i_{\mathrm{arity}(t)} \rangle \mid (a(i_1), a(i_2), \ldots, a(i_{\mathrm{arity}(t)})) \in \beta(t)\}$.

And value(u) can be computed according to Definition 5.
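
The support of Definition 7 can be enumerated by brute force over referent tuples, as in the following sketch. The referents, the allocation `A` and the binary relation type `works_on` are hypothetical; a practical system would query an indexed relation instead of enumerating the Cartesian product.

```python
from itertools import product

# Hypothetical allocation a and interpretation beta of one binary relation type.
A = {"#tom": "o1", "#sue": "o2", "#p1": "o3"}
BETA = {"works_on": {("o1", "o3"), ("o2", "o3")}}

def relation_support(t, arity):
    """Tuples of referents whose allocated objects belong to beta(t)."""
    return {
        refs
        for refs in product(A, repeat=arity)      # all referent tuples of length arity
        if tuple(A[i] for i in refs) in BETA[t]
    }
```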

⁶ The universal referent being only a short-cut notation, 6.3 expands a universally quantified
concept to its complete representation, so that 6.1 and 6.2 can be applied.



3.2 The Compositional Semantics of the CG Formation Operators

Canonically derived conceptual graphs are those obtained from the canon by the
iterative application of the canonical formation operators onto basic conceptual graphs.
This section shows how to compute the set of valid interpretations, the support, of such
graphs.⁷ Therefore, we present the compositional semantics of each operator. But
first, we need to introduce useful definitions and assumptions which will help us
define a precise mapping between the concepts of a graph u and the corresponding
referents in the tuples of $\delta(u)$, its support.

Assumption 5. (Normal Form Assumption) A graph $u = \langle C,R,E,l \rangle$ is said to
be in normal form iff for any pair of distinct concepts $c_1, c_2 \in C$,
$\mathrm{ref}(c_1) \ne \mathrm{ref}(c_2)$. Then we say that NF(u) holds.⁸

Definition 8. Based on some total ordering function, like a lexicographic
comparison function between labels $<_l$, for any pair of distinct concepts $c_1, c_2 \in C$
from a graph u in normal form, either $l(c_1) <_l l(c_2)$ or $l(c_2) <_l l(c_1)$ holds. We
can rank all concepts in a graph u according to this function. For any concept
$c_j$ in u, let us define $\mathrm{pred}(c_j,u) = \{c_k \in C \mid c_j \ne c_k \text{ and } c_k <_l c_j\}$. Then it is easy to
define an index function as $\mathrm{index}(c_j,u) = |\mathrm{pred}(c_j,u)| + 1$. And similarly, we
can define $\mathrm{index}^{-1}(i,u) = \{c \in C \text{ such that } \mathrm{index}(c,u) = i\}$. If NF(u) holds,
$|\mathrm{index}^{-1}(i,u)| = 1$.
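
The index function of Definition 8 amounts to ranking concepts by label. In the sketch below each concept is represented directly by its label string, and Python's built-in string comparison plays the role of the lexicographic ordering; both choices are assumptions made for illustration.

```python
def index(c, concepts):
    """index(c, u) = |pred(c, u)| + 1, ranking concepts by label."""
    pred = [d for d in concepts if d != c and d < c]   # labels compared as strings
    return len(pred) + 1

def index_inv(i, concepts):
    """The set of concepts of u whose index is i (a singleton in normal form)."""
    return {c for c in concepts if index(c, concepts) == i}
```

For the labels `PROJECT:#p1`, `EMPLOYEE:#tom` and `DEPT:#d7`, the ranks are 3, 2 and 1 respectively, which fixes the position of each referent in every tuple of the support.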

Now we can establish a mapping between concepts in a normal form graph u and the
elements (referents) of each tuple in $\delta(u)$, its support. This mapping can be described
as follows: for the j-th element $i_j$ of any tuple $\langle i_1, i_2, \ldots, i_{|C|} \rangle$ in $\delta(u)$ (with
$j \in [1, |C|]$), we have $j = \mathrm{index}(c_j,u)$; that is, according to the valid interpretation of u
represented by $\langle i_1, i_2, \ldots, i_{|C|} \rangle$, $i_j$ is the referent of concept $c_j$ in u for that interpretation.
We also define the support of a graph w, a subgraph of u, in terms of $\delta(u)$.

Definition 9. If w is a subgraph of $u = \langle C,R,E,l \rangle$ and $w = \langle C',R',E',l' \rangle$ with
$C' \subseteq C$, $R' \subseteq R$, $E' \subseteq E$, $l' \subseteq l$, $m = |C'|$ and $n = |C|$, its support with regard to
the support of u, written $\delta'(w,u)$, is defined by construction. We have $\delta'(w,u)$
:= { $\langle i_{k_1}, i_{k_2}, \ldots, i_{k_m} \rangle$ where $k_p \in [1,n]$ for all $p \in [1,m]$, and where for each $i_{k_p}$
there is a j such that $i_{k_p} = i_j$ in $\langle i_1, i_2, \ldots, i_n \rangle \in \delta(u)$, with $k_p \ne k_q$ if $p \ne q$ for all
$p,q \in [1,m]$ }.
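
Read operationally, Definition 9 is a projection of the tuples of the support of u onto the positions of the concepts kept in w. A minimal sketch, assuming tuples are Python tuples and positions are 1-based as in the text (the function name `sub_support` is hypothetical):

```python
def sub_support(delta_u, positions):
    """Project each tuple of the support of u onto the 1-based positions
    k_1..k_m of the concepts that belong to the subgraph w."""
    return {tuple(e[k - 1] for k in positions) for e in delta_u}
```

Because the result is a set, tuples that become identical after projection are merged, as with a relational projection.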

Having established mapping functions between concepts of a graph u and elements of
its support $\delta(u)$, and having defined the support of a subgraph w in terms of the
support of the graph u of which w is a subgraph, we can now proceed with the
definition of canonical formation operators.

⁷ The canonical formation operators were described in [1], together with a proof of
completeness and soundness. Therefore, only their compositional semantics will be
described in this paper.

⁸ The normal form of a conceptual graph was originally defined in [7]. Here we assume that all
acquired graphs will be expressed under that form, thus we formulate an assumption to that
effect. Of course, in practice, there could be actions to be taken to ensure that it is so.

The Copy Operator. Let $u = \langle C_1,R_1,E_1,l_1 \rangle$ and $v = \langle C_2,R_2,E_2,l_2 \rangle$. If v is a copy of u,
written v := copy(u), then:

- $|C_1| = |C_2|$ and $C_1 \cap C_2 = \emptyset$

- $|R_1| = |R_2|$ and $R_1 \cap R_2 = \emptyset$

- $\exists m_1: C_1 \to C_2$, a bijective function mapping the concepts of $C_1$ onto those of $C_2$

- $\exists m_2: R_1 \to R_2$, a bijective function mapping the relations of $R_1$ onto those of $R_2$

- $E_2 := \{(m_1(c), m_2(r)) \mid (c,r) \in E_1\}$

- $l_2(m_1(c)) := l_1(c)\ \forall c \in C_1$

- $l_2(m_2(r)) := l_1(r)\ \forall r \in R_1$

- $l_2(m_1(c), m_2(r)) := l_1(c,r)\ \forall (c,r) \in E_1$

- NF(u) $\Rightarrow$ NF(v) because $|C_2| = |C_1|$, and $l_2 = l_1$ on mapped concepts of u and v

- $\phi(v) := \phi(u)$ since v is isomorphic to u and $l_2 = l_1$ on mapped concepts and relations
of u and v, and consequently, $\delta(v) := \delta(u)$ and value(v) := value(u).

The Simplify Operator. Let $u = \langle C_1,R_1,E_1,l_1 \rangle$ and $v = \langle C_2,R_2,E_2,l_2 \rangle$. If v is obtained
from u by deleting a redundant⁹ relation r of type t from u ($r \in R_1$), written v :=
simplify(r,u), then:

- $C_2 := C_1$

- $R_2 := R_1 \setminus \{r\}$

- $E_2 := E_1 \setminus \{(c,r) \mid c \in C_1\}$

- $l_2(c) := l_1(c)\ \forall c \in C_2$

- $l_2(r) := l_1(r)\ \forall r \in R_2$ (the set of labels is unchanged over the elements of $R_2$)

- $l_2(c,r) := l_1(c,r)\ \forall (c,r) \in E_2$

- NF(u) $\Rightarrow$ NF(v) because $C_2 = C_1$ and $l_2 = l_1$ on the concepts of u and v

- $\phi(v) := \phi(u)$ where the redundant predicate t(referent(r,u)) in $\phi(u)$ was deleted

- since r is redundant, $C_2 = C_1$, and $l_2 = l_1$, then: $\delta(v) := \delta(u)$ and
value(v) := value(u).
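
A rough sketch of redundancy elimination follows. Unlike simplify(r,u), which deletes one designated relation, this hypothetical helper removes every duplicate in a single pass; relations are represented as (label, neighbors) pairs, which is an assumption of the example.

```python
def simplify_all(relations):
    """Drop every relation that duplicates an earlier one
    (same label and the same neighbor at every position)."""
    seen, kept = set(), []
    for label, neighbors in relations:
        key = (label, tuple(neighbors))
        if key not in seen:           # first occurrence is kept
            seen.add(key)
            kept.append((label, neighbors))
    return kept
```

Since only repeated predicates are dropped, the set of concepts is untouched, which is why the support is unchanged by the operation.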

The Restrict Operator. Let $u = \langle C_1,R_1,E_1,l_1 \rangle$, $v = \langle C_2,R_2,E_2,l_2 \rangle$, and let NF(u) hold. If
v is obtained from u by restricting concept $c_1 = [t_1:i_1]$ in u to concept $c_2 = [t_2:i_2]$¹⁰
(with $[t_2:i_2]$ being a specialization of $[t_1:i_1]$, and $l(c_2)$ being the new label of concept $c_1$),
written v := restrict($c_1$,$c_2$,u), then two cases arise: either 1) there is no c in $C_1$ such that
$l_1(c) = l(c_2)$ or 2) $\exists c$ in $C_1$ such that $l_1(c) = l(c_2)$. In Case 1, the normal form assumption
may not hold after the operation since two distinct labels may have the same referent
(as a result of a restrict operation). In this case, additional steps (additional
restrictions) may be needed to maintain the normal form assumption (see Case 2
below). In Case 2, if NF(u) holds, then NF(v) will automatically hold since no new
label is introduced by the restrict operation. So Case 2 is defined to automatically join
concepts having the same label. This is often called the internal join operation in CG
literature [4].

⁹ A relation r in u is said to be redundant if $\exists r'$ in u such that $l(r) = l(r')$ and
$\forall j \in [1, \mathrm{arity}(\mathrm{type}(r))]$, concept(j,r,u) = concept(j,r',u).

¹⁰ Concepts $c_1$ and $c_2$ can be seen as single concept graphs: $\langle \{c_1\},\emptyset,\emptyset,l' \rangle$ and $\langle \{c_2\},\emptyset,\emptyset,l \rangle$
respectively.

Case 1: (there is no concept in u with the new label)

- $C_2 := C_1$

- $R_2 := R_1$

- $E_2 := E_1$

- $l_2(c) := l_1(c)\ \forall c \in C_2$ such that $c \ne c_1$

- $l_2(c_1) := l(c_2)$

- $l_2(c,r) := l_1(c,r)\ \forall (c,r) \in E_2$

- $\phi(v) := \phi(u)$ where a (sub)formula $\phi(c_1)$ of $\phi(u)$ was substituted by formula $\phi(c_2)$

- since $l_2(c_1)$ is a specialization of $l_1(c_1)$, $C_2 = C_1$, and $l_2 = l_1$ for all concepts except $c_1$,
then $\delta(v)$ := { $\langle i_{k_1}, i_{k_2}, \ldots, i_{k_{|C_2|}} \rangle$ | $\langle i_1, i_2, \ldots, i_{|C_1|} \rangle \in \delta(u)$, j = index(c,v), $k_j$ =
index(c,u), c = $\mathrm{index}^{-1}(j,v)$ = $\mathrm{index}^{-1}(k_j,u)$, and where $\langle i_p \rangle \in (\delta'(c_1,u) \cap \delta(c_2))$ for p
= index($c_1$,u) }. Then Definition 5 can be applied to compute value(v).

Case 2: (there is already a concept in u with the new label)

- let $c', c \in C_1$ and $l_1(c') = l(c_2)$; then we define $m_3(c_1) := c'$ and $m_3(c) := c\ \forall c \ne c_1$

- $C_2 := C_1 \setminus \{c_1\}$

- $R_2 := R_1$

- $E_2 := \{(m_3(c),r) \mid (c,r) \in E_1\}$

- $l_2(c) := l_1(c)\ \forall c \in C_1 \setminus \{c_1\}$

- $l_2(r) := l_1(r)\ \forall r \in R_2$

- $l_2(m_3(c),r) := l_1(c,r)\ \forall (c,r) \in E_1$

- NF(v) holds because NF(u) holds and $l_2$ contains all the labels of $l_1$ but one.

- $\phi(v) := \phi(u)$ where subformula $\phi(c_1)$ was deleted.

In order to define $\delta(v)$ we now introduce a tuple reduction operator needed to
eliminate a designated element in each tuple of its operand; this is useful for deleting
a duplicate element.

Definition 10. Let us define the tuple reduction operator $\neg_j$ (that makes a copy
of a tuple without the identified element). That is, if $e = \langle i_1, i_2, \ldots, i_j, \ldots, i_n \rangle$,
then $\neg_j e = \langle i_1, i_2, \ldots, i_{j-1}, i_{j+1}, \ldots, i_n \rangle$. So we can define:
$\delta(v)$ := { $\neg_j \langle i_1, i_2, \ldots, i_{|C_1|} \rangle$ | $\langle i_1, i_2, \ldots, i_{|C_1|} \rangle \in \delta(u)$, k = index($c_1$,u),
j = index(c',u), and $i_k = i_j$ }, and
value(v) can be computed according to Definition 5.

In terms of relational database operators, the restrict operator described here performs 
a select according to a restricting condition on some element of the tuples on which it 
is applied. 
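
The select-then-reduce reading of Case 2 can be sketched as follows. The function names are hypothetical, positions are 1-based as in the text, and which of the two equal positions is dropped is left to the caller.

```python
def reduce_tuple(e, j):
    """Tuple reduction: a copy of e without its j-th element (1-based)."""
    return e[: j - 1] + e[j:]

def restrict_case2(delta_u, k, j):
    """Select the tuples whose k-th and j-th elements agree,
    then remove the duplicate element at position j."""
    return {reduce_tuple(e, j) for e in delta_u if e[k - 1] == e[j - 1]}
```

Tuples in which the two merged concepts carry different referents are filtered out, which is the relational select mentioned above; the reduction step then shortens each surviving tuple by one element.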

The Join Operator. In this section we define the maximal join operator since it
favors the enforcement of the normal form assumption over the resulting graph. Of
course, after a join operation, two concepts could have different labels but the same
referent. As in Section 3.2.3 above, additional restriction operations may be needed to
ensure that the newly joined graph is in normal form. At least, the maximal join
operator joins all concepts which share the same label. Thus, we describe this
operator below.

If $u = \langle C_1,R_1,E_1,l_1 \rangle$, $v = \langle C_2,R_2,E_2,l_2 \rangle$, $w = \langle C_3,R_3,E_3,l_3 \rangle$, and NF(u) and
NF(v) hold, and if X is the maximal join operator applied on u and v, two distinct
graphs, then we can write: w = u X v. Let $C' = \{c \in C_2 \mid \text{there is no } c' \in C_1 \text{ with } l_1(c') = l_2(c)\}$.
Then we have:

- $C_3 := C_1 \cup C'$

- $R_3 := R_1 \cup R_2$

- $\forall c \in C_2 \setminus C'$, we define $m_3(c) = c' \in C_1$ where $l_1(c') = l_2(c)$

- $\forall c \in C_3$, we define $m_3(c) = c$

- $E_3 := \{(m_3(c),r) \mid (c,r) \in E_1 \cup E_2\}$

- $l_3(c) := l_1(c)\ \forall c \in C_1$

- $l_3(c) := l_2(c)\ \forall c \in C'$

- $l_3(r) := l_1(r)\ \forall r \in R_1$ and $l_3(r) := l_2(r)\ \forall r \in R_2$

- $l_3(m_3(c),r) := l_2(c,r)$ if $(c,r) \in E_2$

- $l_3(c,r) := l_1(c,r)$ if $(c,r) \in E_1$

- $\phi(w)$ := the conjunction of $\phi(u)$ and $\phi(v)$ where subformulae $\phi(c)$ for all concepts
c having duplicate labels were deleted

Let $J = \{(j_p,k_p)\}$, a set of associated indexes. Let $\bowtie_J$ be the relational multiple join
operator whose operands are tuples of arbitrary length. Tuples from each operand are
joined (concatenated) whenever the $j_p$-th element of the first tuple is equal to the $k_p$-th
element of the second tuple, for all $p \in [1, |J|]$. The concatenation is done by adding
to the end of the tuple of the first operand the non-duplicate elements of the second
operand, i.e., it is done in such a way as to avoid duplicating the join attributes. That is,
the $k_p$-th element of the second tuple, for all $p \in [1, |J|]$, will not be part of the
concatenated tuple. With our example, we have J = { $(j_p,k_p)$ where $j_p = \mathrm{index}(m_3(c_p),u)$
and $k_p = \mathrm{index}(c_p,v)$, $\forall c_p \in C_2 \setminus C'$ and for $p \in [1, |C_2 \setminus C'|]$ }. If $m = |C_1|$, we have:
$\delta(w)$ := { $\langle i_{k_1}, i_{k_2}, \ldots, i_{k_{|C_3|}} \rangle$ | $\langle i_1, i_2, \ldots, i_n \rangle \in \delta(u) \bowtie_J \delta(v)$, j = index(c,w),
$k_j$ = index(c,u) if $c \in C_1$ or $k_j$ = m + index(c,v) if $c \in C'$, c = $\mathrm{index}^{-1}(j,w)$ },
and value(w) can be computed according to Definition 5.
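
The relational multiple join described above can be sketched with two nested loops. This is an illustrative, unoptimized version; the function name `multi_join` and the representation of J as a set of 1-based index pairs are assumptions, and the final reordering of elements by index(c,w) is not shown.

```python
def multi_join(delta_u, delta_v, J):
    """Join two sets of tuples on the 1-based index pairs (j_p, k_p) in J.
    Join positions of the second operand are not repeated in the result."""
    dropped = {k for _, k in J}
    result = set()
    for e1 in delta_u:
        for e2 in delta_v:
            if all(e1[j - 1] == e2[k - 1] for j, k in J):
                # keep only the non-join elements of the second tuple
                rest = tuple(x for pos, x in enumerate(e2, start=1)
                             if pos not in dropped)
                result.add(e1 + rest)
    return result
```

With an empty J the operation degenerates to a Cartesian product, which corresponds to joining two graphs that share no label.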



3.3 The Interpretation of Universally Quantified Concepts 

Universally quantified concepts represent all objects conforming to a certain type;
they are used as a short-cut notation. Therefore, the extensional semantics of the CG
notation does not depend on that notation. However, its use in a CG expresses a
constraint that we describe here. If a universally quantified concept c = [t:$\forall$] appears
as the j-th concept in a normal form graph u, i.e., $\mathrm{index}(c,u) = j \in [1,n]$ where $n = |C|$,
that means that there is no restriction when it comes to the instantiation of that
concept with one of the objects in $\beta(t)$, with regard to the truth value of u, as long as
$\beta(t)$ is not empty. Consequently, if $e = \langle i_1, i_2, \ldots, i_j, \ldots, i_n \rangle \in \delta(u)$, there should be
$|\delta(c)| - 1$ other tuples with the exact same elements except for the j-th element, which
should be a different element of $\delta(c)$ each time. So we have:

Assumption 6. (Scope Satisfiability Assumption) If the j-th concept c in a graph
u is universally quantified, then for any $e = \langle i_1, i_2, \ldots, i_n \rangle$ in $\delta(u)$ let us define
comp(e) = { $e' = \langle k_1, k_2, \ldots, k_n \rangle \in \delta(u)$ where $i_p = k_p$ for all $p \ne j$ }. If the union
of the j-th element of each e' in comp(e) = $\delta(c)$, then we say that the predicate
scope_satisfiability(u) holds (by default, scope_satisfiability(u) always holds
for a graph u that does not contain any universally quantified concept).¹¹

Definition 11. Let us now define the truth value of a normal form graph u as:
truth_value(u) := value(u) $\wedge$ scope_satisfiability(u).
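
One way to read Assumption 6 and Definition 11 operationally is sketched below, for a graph with a single universally quantified concept at position j. It assumes the condition must hold for every tuple e of the support, and it takes the conforming referents of the quantified concept as a plain set `delta_c`; both are interpretations made for this example.

```python
def scope_satisfiability(delta_u, j, delta_c):
    """Check that, for every tuple e, the tuples agreeing with e everywhere
    except at 1-based position j cover all referents in delta_c."""
    for e in delta_u:
        comp = [e2 for e2 in delta_u
                if all(a == b
                       for p, (a, b) in enumerate(zip(e, e2), start=1)
                       if p != j)]
        if {e2[j - 1] for e2 in comp} != set(delta_c):
            return False
    return True

def truth_value(delta_u, j, delta_c):
    """Definition 11: value(u) and scope_satisfiability(u)."""
    return bool(delta_u) and scope_satisfiability(delta_u, j, delta_c)
```

An empty support trivially satisfies the scope check but still yields a False truth value, because value(u) requires at least one tuple.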

With the definitions given in this paper, we showed how the truth value of a normal
form graph can be computed (as the outcome of Definition 11) for any graph derived
from the canon, based on the interpretation given by $\beta$. Section 4 below discusses
how the set of definitions given above can be extended to include the enforcement of
semantic constraints.



4 Extended Canons 

4.1 Semantic Constraints 

Our previous work on constraints [5,6] showed how to formalize all semantic
constraints found in database literature today in terms of a simple yet powerful
mechanism based on conceptual graphs and on the projection operator. For
instance, if $u_1$ and $v_1$ are the graphs of Figures 1 and 2 below, then false $u_1$ unless $v_1$ is
constraint $C_1$ stating that: no employee can work on a project unless s/he is assigned
to that project.



Fig. 1. A graph $u_1$ which should always be false according to constraint $C_1$.



Fig. 2. The exception graph ($v_1$) to the constraint of Figure 1.



¹¹ The advantage of having a scope_satisfiability predicate is that its definition may be
extended to include other quantifiers without having to change the definitions given in this
paper.

¹² Since space is scarce, the reader is invited to read [3] as a more thorough introduction on the
representation of semantic constraints under the CG formalism.

Among other things, we showed in [5,6] how to represent a relation type signature
constraint using false-unless constructs. For example, the graphs of Figures 3 and 4
represent constraint $C_2$ = false $u_2$ unless $v_2$: no relation of type agnt can be used unless
its first parameter is of type ACTION_VERB and its second parameter is of type
PHYSICAL_ENTITY.



Fig. 3. Graph $u_2$, needed to represent constraint $C_2$ over the relation type agnt.



(Figure 4 concept nodes: [ACTION_VERB: *x], [PHYSICAL_ENTITY: *y])

Fig. 4. Graph $v_2$, which is the signature graph of the relation type agnt.



By including constraints other than those encoded in B, this constraint representation
mechanism allows the representation of knowledge according to a more precise
model of the reality. Let H be the set of all semantic constraints related to the domain
to be modeled (with $B \subseteq H$). We can now extend the definition of a canon by using
$\langle T,I,::,H \rangle$ instead of $\langle T,I,::,B \rangle$, a more general definition of what a canon is. Of
course, we now need to define how the truth value of canonically derived graphs is
determined under such an extended canon.



4.2 Validating Derived Graphs under an Extended Canon 

The only difference with what was defined so far is that we now have in H additional
conditions under which the interpretation of a derived graph must be evaluated. For
each constraint $C_i$ of the form: false $u_i$ unless $v_i$, we know that the support set $\delta(u_i)$
should be empty, otherwise the constraint would be violated, with the exception of
elements in $\delta(u_i)$ that are also in the support set of the corresponding projection of $u_i$
onto $v_i$, i.e., also in $\delta'(u_i,v_i)$, since these elements satisfy the unless clause of the
constraint. Let us define the support set of a constraint $C_i$ = false $u_i$ unless $v_i$ as:
$\delta'(C_i) = \delta(u_i) \setminus \delta'(u_i,v_i)$. We state below the constraint satisfiability assumption.

Definition 12. (Constraint Satisfiability) If $C_i$ is a constraint in H and $\delta'(C_i) = \emptyset$,
then $C_i$ is said to be satisfied. Otherwise $C_i$ is said to be violated.

Definition 13. If u is a graph that violates some constraint, then truth_value(u)
:= False.
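
Under the set-difference reading of the constraint support, Definition 12 reduces to an emptiness test. In the sketch below, both supports are assumed to be already computed and projected onto the same positions, so that their tuples are directly comparable; the function names are hypothetical.

```python
def constraint_support(delta_u, delta_u_in_v):
    """Tuples of the support of u_i that are not covered by the unless clause."""
    return set(delta_u) - set(delta_u_in_v)

def is_satisfied(delta_u, delta_u_in_v):
    """Definition 12: a constraint is satisfied iff its support set is empty."""
    return not constraint_support(delta_u, delta_u_in_v)
```

For the employee example, a tuple recording that someone works on a project without being assigned to it remains in the constraint support and marks the constraint as violated.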

Definition 14. If truth_value(u) = True according to Definition 11 and
truth_value(u) = False according to Definition 13, then the CG system is said
to be inconsistent.



5 Conclusion and Future Developments

This paper presented the extensional semantics of the conceptual graph notation in a
framework closely related to relational database (RDB) theory. The main motivation
behind this choice is the ease of transfer from theory to practice that it affords.
Furthermore, rapid prototyping, development and maintenance of CG-based systems
would highly benefit from well established theory and well tested tools such as those
found in RDB literature. The database of a CG system, a set of existentially quantified
assertions (graphs), is easily mapped onto the RDB model. Finally, years of practice
under this model make it a favorite with investors when a technological platform
must be chosen for a large industrial project. Consequently, this paper favored this
paradigm to provide an extensional semantics for the CG theory. It also described the
compositional semantics of the canonical formation operators, along with a simple
procedure to implement model checking. Right away one could think of applying
RDB search heuristics to speed up model checking. We will leave that exploration for
future work.



Abstract: The work analyses typical uses of attribute-like relations in conceptual
graphs and identifies general sources of failure to convey intended meaning.
These include relations that do not contain a semantic bond between their arguments,
semantically superfluous concept types, and confusion between first-order
and second-order levels. The analysis focuses on the semantics of physical quantities
and results in their formal definition in terms of conceptual graphs.



1 Introduction 

The inherent power and flexibility of the CGs formalism makes it potentially possible
to formulate models of any semantic complexity. This power makes it particularly
important that we fully understand the semantics and logic of the basic CGs building
blocks. Understanding the semantics of CGs models becomes particularly critical
should we embark on their full implementation. This requirement for semantic rigor
has forced an extensive theoretical analysis and re-evaluation of the semantics of a
number of typical CGs examples. Attention was quickly drawn to the traditional
treatment of attribute-like relations, which frequently failed to convey the intended
meaning of the graph. The main causes for this appeared to be:

a) Confusion about which parts of the domain knowledge should be represented by
concepts, and which should map onto relations;

b) Overuse or misuse of the attribute relation linked to sink concepts that do not
represent a valid space of attribute values;

c) Mixing object-level and meta-level statements in the same model without clearly
defining, or even understanding, which is which;

d) Confusing a general property of an object of a given category with the space of the
values for this property, and with the actual value of that property for a particular
object.

Such problems lead to semantic ambiguity or inconsistency and result in a loss of
knowledge. This paper will examine, in some depth, typical examples of incorrect use
of attribute-like relations and suggest solutions that will avoid such problems.

A few comments on our typography: Quoted graphs or formula examples are shown
in Courier New. Any fragment of a graph or formula shown in this font retains its
original meaning. Our own illustrative examples use Lucida Console.



2 Knowledge Content of Types and Relations 



Let us illustrate issue a) above, starting with an example of the representational
primitive of Prehension, [1] p. 85, Fig. 2.9, which represents a concrete fact of
relatedness. It is symbolized by the relation (Has), as in:

[Mother]→(Has)→[Child].

Graph 2.1

Sowa's interpretation of this graph, "Some mother has a child", is not justified if it means
that the woman is the child's biological mother. The type label CHILD could have two
different meanings. The one apparently implicit in the graph, assuming the relation
(CHILD), is given by:

type CHILD(x) is [PERSON]→(CHILD)→[PERSON:*x].

Definition 2.1

In this case, the interpretation of [Child] is a person who is somebody's child. The
type Child thus defined has practically no knowledge content. Every person is
somebody's child, i.e. (∀x) Child(x) ≡ PERSON(x), and there is no semantic link to
the defining parent. Such a link could be only established by a definition constructed
along the following lines:

type PAVEL'S_CHILD(x) is [MAN:Pavel]→(CHILD)→[PERSON:*x].

Definition 2.2



The second interpretation of [CHILD] could be possibly given as a person who is less
than 13 years old. Strictly speaking, CHILD is a role type, not a natural type, whereas
PERSON is a natural type. As with Child, there is nothing in Graph 2.1 to show that
the child has been born to the given woman, even if it has. If a woman's child is now a
grown-up man, the type label CHILD would not apply to him, and we would have to
change Graph 2.1 to [Mother]→(Has)→[Man], which suggests an interpretation
radically different from "Some mother has a child". Obviously, one could drastically
dilute the original interpretation to "There exists an individual of type Mother and an
individual of type Child that are related by relation Has". Because it is so heavily
overloaded, Has contains very little meaning on its own. It could stand in for many valid
relations between two concepts, but we would not know what the resulting graph
would mean. In the graph



Graph 2.2

we can deduce from additional domain knowledge that Tom is not Octavia's child. We
still do not know, though, if Tom is Octavia's slave, or whether she had him for dinner
as a guest or as the main course. There is another aspect of (Has) that further
complicates its semantics, namely, its strong temporal or ACT-like content. This means that
Has is not a relation, but a state or a mapping from an act into a state, i.e. Has<STATE
or Has<ACT, depending on the actual situation. We suggest either defining HAS
rigorously for dedicated, specific use, or avoiding it altogether.



2.1 The Spurious Sink Concept 

Relation (ATTR) may or may not have a temporal component, but it is even more
overloaded than (Has) and suffers from the same semantic confusion, demonstrated
by the next graph:

[PERSON]→(ATTR)→[BIRTHDATE].

Graph 2.3

Let us define BIRTHDATE as the date of a day when somebody (anybody) was born.
Similar to Graph 2.1, a [BIRTHDATE] may or may not be associated with a given
person. There is more. Obviously, BIRTHDATE < DATE. Whereas DATE is a natural
type, BIRTHDATE is a role type. Within the temporal limits of human existence, each
day has been the birth date for some person. Thus, for the past x million years and,
hopefully, a long way into the future, every date is a birth date:

(∀x) DATE(x) ≡ BIRTHDATE(x)

You may notice that BIRTHDATE and Child (as defined in Def. 2.1) fall into the
same category, namely, that of role types that do not add any knowledge to the model.
If we remove BIRTHDATE from our ontology, Graph 2.3 will become
[PERSON]→(ATTR)→[DATE]. Because ATTR is so weak in this interpretation,
the graph does not say what the significance of the date to the person is: it could be the
date of their dentist appointment, graduation, wedding, death, etc. We can solve the
problem simply by using the relation (DATE_OF_BIRTH). This relation, either
specified as primitive or explicitly defined, contains enough knowledge to represent the
required link between the natural concepts [PERSON] and [DATE]:

[PERSON]→(DATE_OF_BIRTH)→[DATE].

Graph 2.4

We suspect that some of the semantic confusion analyzed above stems from the
conventional interpretation of database tables such as the one below, where the attribute
name BIRTHDATE is effectively re-defined for each row, i.e. TOM'S_BIRTHDATE,
DICK'S_BIRTHDATE, etc.

PERSON    BIRTHDATE

Tom       10/03/1960

Dick      20/04/1970

Harry     20/04/1970

This is possible because of the ordering of the table,
which actually implies relation (DATE_OF_BIRTH) of Graph 2.4. Because a CGs
knowledge base, being a conjunction of monadic and generally n-adic predicates, is not
structured in this way, such relations must be made fully explicit.
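
To illustrate what making the relation fully explicit could look like in practice, the sketch below turns each table row into a linear-form graph that names the relation between the two natural types. The function name `to_graphs` and the ASCII arrow `->` are assumptions of this example, not notation from the paper.

```python
# Rows of the table above: (person, birth date).
rows = [("Tom", "10/03/1960"), ("Dick", "20/04/1970"), ("Harry", "20/04/1970")]

def to_graphs(rows, relation="DATE_OF_BIRTH"):
    """Render each row as an explicit binary relation between a PERSON
    concept and a DATE concept, in an ASCII linear form."""
    return [f"[PERSON: {name}]->({relation})->[DATE: {date}]"
            for name, date in rows]
```

Each generated graph carries the link that the table only implies through the alignment of its columns, so Dick and Harry can share a date value without any per-row role type.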



2.2 ‘Conceptualness’ and ‘Transitiveness’ 

This problem results from using the (Attr) relation in situations where it cannot have
a meaningful sink concept, as shown by examples from [1], p. 434 and [2]:

[Relation]→(Attr)→[Transitive]

[Graph: {*}]→(Attr)→[Conceptual].

Type labels normally correspond to nouns or verbs of natural language. However, the
type labels of the sink concepts above correspond to adjectives. Adjectives behave
either attributively (e.g. "rich man") or predicatively ("the man is rich"). They normally
cannot stand on their own, nor can the concepts [Transitive] and [Conceptual].
This becomes more apparent if we try to instantiate them. The reading of
[Transitive: #], [Conceptual: #] would be *the transitive and *the conceptual.
We could use types TRANSITIVENESS or CONCEPTUALNESS, but then we
would have to ask how many different values of transitiveness or conceptualness there
are. Because concepts with such types cannot be instantiated, we conclude that these
types are not semantically valid components of our ontology.



2.3 First-Order v. Second-Order Ambiguity 

[Ski: #]→(Chrc)→[Length]→(Amt)→[Measure: <167, cm>].

Graph 2.5

In [1], p. 32, Sowa describes (Chrc) as a second-order relation that links an object to
a characteristic, as in [Ball]→(Chrc)→[Color: Red]. In comparison with
[Color: Red], [Length] does not look like a second-order concept. If it were, it
would have to have a first-order predicate as its referent, and the type label would have
to be a second-order predicate, such as LENGTH_TYPE. We could suggest
[LENGTH_TYPE: GREAT_LENGTH] as a possible solution, but then the type label
GREAT_LENGTH cannot be linked to the first-order relation (Amt) that links it to the
first-order concept [Measure] - a type cannot be measured!

Let us accept that [Length] is a first-order concept, even though it means that [1]
offers two contradicting interpretations for (Chrc). Now we may ask what its
referents would be (as we would ask in the case of [GREAT_LENGTH]). Because
[Length] is assumed to be independent of any actual measurement, its referents
would have to be standard individual identifiers, each standing for a single length that
can be associated with any number of physical objects and their dimensions, distances,
etc. Then an individual [Length: #] could be mapped into a number of equivalent
values, each belonging to a different measurement system (e.g. imperial, metric, etc.).
There are several considerations suggesting that [Length] and similar concepts are
superfluous.

As there are infinitely many different lengths, there would have to be infinitely 
many individual identifiers that would conform to the type Length. This set of identi- 
fiers would have a one-to-one mapping, for a single measurement system, into the set 
of real numbers. Thus, these identifiers would be equivalent to non-negative real num- 
bers, effectively duplicating them. This is neither necessary nor desirable. We can use 
real number concepts directly as the values of the ‘length attribute’ and connect them, 
through a suitable relation, to the concept of the object whose length we are modeling. 

Another reason why we question the representational value of [Length] is that, 
arguably, no representation of a physical quantity can be fully independent of an actual 
measurement, whatever form it might take. Even comparing which of two sticks is 
longer, say, involves mapping them to viewing angles, which are ‘measured’ by the 
number of relevant photosensitive cells on the retina, and by using distance clues. 

We conclude that [Length], as used in Graph 2.5, is superfluous. We reach the
same conclusion with concepts such as [Speed], [Temperature], etc.



3 Semantics of Quantities 

Another source of potential semantic ambiguity or inconsistency in a CG-based
model, with respect to modeling properties, is the use of ‘complex’ concept referents,
especially those associated with quantities. To illustrate this problem, let us analyze in
some detail the concept [Salary], taken from an example in [1], p. 34:

[Person: Tom]←(Agnt)←[Earn]→(Thme)→[Salary: @ $30,000].

Graph 3.1 

Because the unique naming convention demands that the concepts representing actual
salaries of different people be associated with different individual identifiers, Sowa
adds a unique identifier into the referent field of [Salary], producing [Salary:
#78902 @ $30,000]. This is different from the concept [Salary: #41337 @
$30,000] of a different person, Sue, who earns as little as Tom.

The φ operator maps concept types to monadic predicates of FOL, and concept
referents to corresponding predicate arguments. Before applying φ to [Salary: ...], we
will slightly alter one of its ‘referents’ by separating the semantic components of the
value $30,000, getting [Salary: #78902 @ $ 30,000]. This translates into
the predicate Salary(#78902 @ $ 30,000). We can see that the type Salary
corresponds to a tetradic predicate with arguments #78902, @, $, and 30,000.

Actually, Salary is effectively a second-order predicate, as @ and $ subsume or imply
first-order binary relations. To get an individual concept for each of these referents,
we will decompose the complex structure of [Salary] into a graph whose concepts
map into strictly first-order monadic predicates, and which has only first-order
relations.

Individual identifier: Let us assume that Tom is paid in cash. Then we can think of
the individual identifier #78902 as the surrogate for the actual bundle of all the bank
notes and coins that Tom has received in his pay packets during the year:
[MONEY_BUNDLE: #78902].¹ Sue's pay, [MONEY_BUNDLE: #41337], contained
different notes and coins, but their total value was the same.

The @ symbol: Semantically, @ represents a binary relation that links a bundle of
money to its financial value expressed in terms of a unit of currency. We will replace @
with the relation (VALUE), leaving the sink concept still underspecified:

[MONEY_BUNDLE: #78902]→(VALUE)→[monetary quantity].

Graph 3.2 

Numerical value: Clearly, 30,000 is a number: [NUMBER: 30,000].

Currency unit: After detaching the $ symbol from the 30,000 and replacing it 
with the more specific US$, we can represent the latter as an individual referent that 
conforms to the type label CURRENCY_UNIT. To define a type label representing a 
monetary quantity in terms of a currency unit, we need a relation that would link the 
concept [CURRENCY_UNIT] to [NUMBER] . To get it, we will look for inspiration 
to the domain of physical quantities. 

A physical quantity consists of a scalar that has a dimension, (DIM), given by a unit
of measurement, where QUANT < NUMBER. A monetary quantity has the same structure:

type PHYS_QUANT(x) is [NUMBER: *x]→(DIM)→[PHYSICAL_UNIT].
type MONEY_QUANT(x) is [NUMBER: *x]→(DIM)→[CURRENCY_UNIT].

To produce a direct equivalent to the conventional notation, as in $30,000, we reconnect
the numerical and dimensional (currency) information, using a type definition:

type US$_QUANT(x) is

[NUMBER: *x]→(DIM)→[CURRENCY_UNIT: US$].

Definition 3.1 

Obviously, US$_QUANT < NUMBER. However, the purpose of this type is not to
show whether it applies to any particular number: all (non-negative) numbers with a
relevant number of decimal places will conform to US$_QUANT. The purpose of this
type label is to carry the measurement dimension information for use in numerical
comparisons and arithmetic operations. Now we can formulate the following graph:

[MONEY_BUNDLE: #78902]→(VALUE)→[US$_QUANT: 30,000].

Graph 3.3 
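
As an illustration of how the decomposed graph maps into strictly first-order predicates, the following minimal Python sketch translates Graph 3.3 into a conjunction of FOL atoms. It covers only ground concepts and binary relations, so it is a simplification of the φ operator rather than a full implementation; the names Concept, Relation and phi are hypothetical and introduced here only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Concept:
    type_label: str  # maps to a monadic predicate, e.g. MONEY_BUNDLE
    referent: str    # maps to the predicate argument, e.g. #78902


@dataclass(frozen=True)
class Relation:
    label: str       # maps to a binary predicate, e.g. VALUE
    source: Concept
    sink: Concept


def phi(relations):
    """Translate ground binary relations into a conjunction of FOL atoms."""
    atoms = []
    for rel in relations:
        for concept in (rel.source, rel.sink):
            atom = f"{concept.type_label}({concept.referent})"
            if atom not in atoms:  # emit each concept atom only once
                atoms.append(atom)
        atoms.append(f"{rel.label}({rel.source.referent}, {rel.sink.referent})")
    return " ∧ ".join(atoms)


# Graph 3.3: [MONEY_BUNDLE: #78902] -> (VALUE) -> [US$_QUANT: 30,000]
bundle = Concept("MONEY_BUNDLE", "#78902")
amount = Concept("US$_QUANT", "30000")
formula = phi([Relation("VALUE", bundle, amount)])
print(formula)
```

Every predicate in the resulting formula is monadic or binary and first-order, in contrast to the tetradic reading of the original [Salary] concept.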

¹ Obviously, this would not work for EFT (Electronic Funds Transfer). We could differentiate
between a physical bundle of money and an electronic bundle of money.

Why did we use the type label MONEY_BUNDLE instead of Salary, as in
[Salary: #78902]? The fact that a sum of money was paid as a salary is coincidental: a
bundle of money could be used for a loan, bribe or ransom; it can be stolen or won in a
poker game. Used as a type label, Salary represents a role for a bundle of money, as
opposed to the natural type MONEY_BUNDLE. However, defined as an attribute
relation, SALARY can be used as follows:




Graph 3.4 

Apart from the role type vs. natural type dichotomy, the value of a salary is dissociated
from any actual bundle of money because the payments are spread throughout the year.
A new employee may not have received any money yet, but the monetary value of
his/her salary is clearly specified in their contract. The relation (SALARY), as used in
Graph 3.4, is not associated with such problems.



4 Semantics of Attributes and Values 

Three examples of color representation are presented in [1] (p. 32, Fig. 1.9):

(Red)→[Ball].

Graph 4.1 

The relational label Red means here red object, not red color. Graph 4.1 translates into
(∃x:Ball)Red(x). Using RED_OBJ instead of Red, we can translate this typed
predicate calculus formula into an equivalent FOL statement
(∃x)(BALL(x) ∧ RED_OBJ(x)). Graph 4.1 could also be re-expressed in typed
calculus as (∃x:BALL)(∃y:RED_OBJ) x=y. Using φ, both these formulas map back
into CGs as [BALL]--[RED_OBJ]. However, Graph 4.1 can also be reformulated in
typed calculus as (∃x:RED_OBJ)BALL(x), which maps back into CGs as
(BALL)→[RED_OBJ]. We may notice here that Sowa's Red or our equivalent
RED_OBJ do not explicitly represent a red color. The semantic correctness of these
types is implied by the semantics of the color concepts used in their definitions.

[Ball]→(Attr)→[Red].

Graph 4.2 

In this graph, Red obviously means red color, not the red object of Graph 4.1. The concept
[RED_COLOR] can represent any physically possible shade of red, as defined by its
hue, saturation and brightness (or using any other relevant color space system). We can
think of a referent of the concept [COLOR] as the paint code embossed on a car
identification plate. Like (Has), (Attr) is a very general relation. This generality
negatively affects the power and computational efficiency of our queries. For example,
if we wanted to find the ball's color in the knowledge base, we would use as a query
the graph [BALL]→(Attr)→[COLOR]. The system would then have to project
[COLOR] on every concept attached to [BALL] by (Attr). This projection would
involve any other types of attribute value spaces (for size, weight, price, etc.). Such
matching would include costly accesses to the type lattice, or less costly type code
comparisons. As we shall see later, using (Attr) in this way would also make it
difficult to formulate meta-level definitions of, or queries about, the types of attributes, as
opposed to their values. The solution to these problems is, as before, to put maximum
information on the relation between the two relevant concepts:

[BALL]→(COLOR)→[RED_COLOR].

Graph 4.3 
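
The difference between querying through the generic (Attr) relation and through a specific relation such as (COLOR) can be sketched in Python as follows. This is only a toy model under stated assumptions: the knowledge base is a list of tuples, the type lattice is a child-to-parent dictionary, and all names (SUPERTYPE, query_generic, query_specific, the sample facts) are hypothetical.

```python
# Toy type lattice as child -> parent (assumed labels, T is the top type).
SUPERTYPE = {"RED_COLOR": "COLOR", "COLOR": "T", "kg_QUANT": "NUMBER", "NUMBER": "T"}


def is_subtype(type_label, ancestor):
    """Walk up the lattice; each call stands for one access to the type lattice."""
    while type_label is not None:
        if type_label == ancestor:
            return True
        type_label = SUPERTYPE.get(type_label)
    return False


# Facts as (source object, relation, sink type, sink referent).
generic_kb = [("ball1", "Attr", "RED_COLOR", "#4385"),
              ("ball1", "Attr", "kg_QUANT", "0.4")]
specific_kb = [("ball1", "COLOR", "RED_COLOR", "#4385"),
               ("ball1", "WEIGHT", "kg_QUANT", "0.4")]


def query_generic(kb, obj, wanted_type):
    """[obj]->(Attr)->[wanted_type]: every (Attr) sink needs a type check."""
    hits, lattice_checks = [], 0
    for source, relation, sink_type, referent in kb:
        if source == obj and relation == "Attr":
            lattice_checks += 1
            if is_subtype(sink_type, wanted_type):
                hits.append((sink_type, referent))
    return hits, lattice_checks


def query_specific(kb, obj, relation):
    """[obj]->(relation)->[T]: the relation label alone selects the sink."""
    return [(t, ref) for source, rel, t, ref in kb if source == obj and rel == relation]


generic_hits, checks = query_generic(generic_kb, "ball1", "COLOR")
specific_hits = query_specific(specific_kb, "ball1", "COLOR")
```

Both queries return the same sink concept, but the generic one performs one lattice check per attribute attached to the ball, whereas the specific one performs none.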

For any non-colorless physical object, a more general graph would apply:
[PHYS_OBJ]→(COLOR)→[COLOR]. Here the label COLOR appears in two completely
different roles, as a relational label and as a type label - we could name the
relation (COLOR_OF). With a less human-oriented view of color identification, we could
replace [COLOR] with a concept representing a measurement of light wavelength,
brightness, etc., which would remove this apparent dual role.

Using COLOR as a relation, as in Graph 4.3, if we wanted to know the color of our
ball, we would use the query [BALL]→(COLOR)→[T]. The system would directly
retrieve the sink concept of (COLOR), without having to access the type lattice or
make type code comparisons. Let us consider the following graph:

[Ball]→(Chrc)→[Color: Red].

Graph 4.4 

Hypothetically, there could be two possible interpretations, depending on whether we 
define (Chrc) as (i) taking exclusively first-order concepts for arguments, or (ii) tak- 
ing a second-order concept as its second argument. 

First-order Color: We can interpret [Color: Red] as a first-order concept.
Then (Chrc) would have the same meaning as (Attr), and there could be only one
shade of red color in the whole domain, e.g. as if modeling traffic lights. The referent
Red is the individual identifier of this color.

Second-order Color: [1] interprets Graph 4.4 as “there exists a ball, Red is a color,
and the ball has Red as its characteristic”. Thus, Color is a second-order type and,
presumably, the characteristic Red has some actual value, e.g. [Red: #4385],
which is not shown. Then, to quote, “(Chrc) relates an entity to a second-order
characteristic like color, size or weight”. To avoid confusing this second-order Color with
the first-order type label COLOR, as in COLOR > RED_COLOR > CRIMSON, and
[CRIMSON: #3458], we should use the label COLOR_TYPE. Then Graph 4.4 becomes:

[BALL]→(Chrc)→[COLOR_TYPE: red_color].

Graph 4.5 

We might ask: What was the question answered by Graph 4.5? There are three possible 
questions that we could ask in the context of object properties: 

a) What are the general properties that an object (or a category of objects) has? 

b) What are the concrete properties (attribute values) of a given object? 

c) What types of values are associated with a given property (attribute)? 
Fig. 4.1. Illustrating aspects a)-c) above on an example represented in UML

The answer to question a) about pump p38492 would consist of the first column in the
attribute compartment of the object box in Figure 4.1. Using another domain: a geometric
point has a position in a co-ordinate system, but no length, width, thickness, color,
mass, specific heat, or conductivity. A solid sphere will have diameter, mass, color
(unless made from a completely transparent material), but no length, width or height
(unless we define these from diameter), etc. By asking question b), we want to ascertain
where, given a space of values, an object stands with respect to a property identified
by a). For our pump, the answers are in the second column. Question c) determines
what the space of the values of a specific property identified by a) is, e.g. for manufacturer
the space is the set of company names, for weight it is the set of non-negative
numbers, each multiplied by a unit of mass, for axis of rotation it is
the set {horizontal, vertical}.

We can see that Graph 4.5 does not answer question b), which asks about attribute 
values, because Graph 4.5 does not contain a first-order concept [RED_COLOR] that 
would hold such a value in its referent. We suggest that Graph 4.5 attempts to answer 
question a). It does not quite succeed, because the sink argument of (Chrc), 
[COLOR_TYPE: red_color], clearly cannot represent the relation type COLOR,
which we used in Graph 4.3. Instead, it represents a restricted space of values (i.e. 
value type) for the attribute COLOR. The individual attribute relations, the different at- 
tribute value spaces and nested subspaces serve as cognitive navigation points on a path 
(represented by the arrow lines) leading from an object to the values of its properties. 
We do not have to reach the ultimate value constant, but every step on this path will 
give us more specific information about the properties of the object. We may illustrate 
these differences by the following diagram: 





244 Pavel Kocura 



[Figure: diagram labeled "Attribute Relations", "Spaces of Attribute Values", "Spaces of Attribute"]

Fig. 4.2. Attributes are relations, which point to progressively more focused spaces of attribute
values

Inspired by this metaphor, we will interpret the relation label ATTR as representing the 
first step and define it formally. First, however, let us outline our development of a 
formal representation of physical attributes and quantities in CGs, based on the ideas 
discussed in previous sections. 



5 Representing Physical Quantities 

First we need a type for non-negative numbers:

type 0≤NUMBER(x) is [NUMBER: *x]→(≥)→[NUMBER: 0].
type m_QUANT(x) is

[0≤NUMBER: *x]→(DIM)→[LENGTH_UNIT: m].

Definition 5.1 An m_QUANT is any non-negative quantity whose dimension is the meter.

Obviously, LENGTH_UNIT :: cm, km, mm, mile, etc. We need to use physical
quantity units explicitly only when defining primary quantities: length, time, etc. To
define derived quantities, we use u_QUANT types, where u is a physical unit.

type m²_QUANT(x) is

[Graph: two [m_QUANT] concepts linked as operands to (MULTIPLY), whose arc 3 leads to [NUMBER: *x]]

Definition 5.2 A quantity in m² is the product of multiplying two quantities in m.

We could have defined m²_QUANT using the derived unit [AREA_UNIT: m²], which
itself could be defined as an individual. The following graph defines a time quantity:

type s_QUANT(x) is [0≤NUMBER: *x]→(DIM)→[TIME_UNIT: s].



type ms⁻¹_QUANT(x) is



Definition 5.3 A length quantity (m) divided by a time quantity (s) produces a velocity quantity
(m/s).

For the arcs of the relation (DIVIDE), 1 = Dividend, 2 = Divisor, 3 = Quotient.



type ms⁻²_QUANT(x) is



Definition 5.4 Acceleration is a quantity produced by dividing velocity (m/s) by time (s).

All the SI (or any other) units and corresponding physical quantities can be represented 
in CGs this way. This representation enables us to define and implement constraints on 
mathematical operations on physical quantities. 
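
The way derived quantities arise from primary ones in Definitions 5.2 to 5.4 can be mirrored by tracking unit exponents. The Python sketch below is an illustration under our own assumptions, not the CG type definitions themselves: a dimension is represented as a sorted tuple of (unit, exponent) pairs, the names Quantity, multiply, divide, metres and seconds are hypothetical, and the non-negativity check of Definition 5.1 is omitted for brevity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Quantity:
    value: float
    dims: tuple  # sorted (unit, exponent) pairs, e.g. (("m", 1), ("s", -1))


def _combine(dims_a, dims_b, sign):
    """Add (sign=+1) or subtract (sign=-1) the unit exponents of two quantities."""
    exponents = dict(dims_a)
    for unit, exponent in dims_b:
        exponents[unit] = exponents.get(unit, 0) + sign * exponent
    return tuple(sorted((u, e) for u, e in exponents.items() if e != 0))


def multiply(x, y):
    return Quantity(x.value * y.value, _combine(x.dims, y.dims, +1))


def divide(x, y):
    # Arc 1 = dividend, arc 2 = divisor, the result plays the role of arc 3.
    return Quantity(x.value / y.value, _combine(x.dims, y.dims, -1))


def metres(value):
    return Quantity(value, (("m", 1),))


def seconds(value):
    return Quantity(value, (("s", 1),))


area = multiply(metres(3.0), metres(2.0))          # quantity in m^2
velocity = divide(metres(10.0), seconds(4.0))      # quantity in m s^-1
acceleration = divide(velocity, seconds(2.0))      # quantity in m s^-2
```

Each result carries its dimension with it, which is the information a constraint on later arithmetic operations would need to inspect.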



5.1 Arithmetic Operations with Physical Quantities 

All the definitions of physical quantities, as well as the knowledge representation of 
attributes and attribute-like relations, can be fully deployed using the OWLS CGs theo- 
rem prover. The following OWLS script example formulates the rule that different 
physical quantities cannot be added together. 




Graph 5.1 If there are two different subtypes of the type PHYS_QUANT then there does not
exist a number that is the result of the sum of these quantities.²



² This does not include physical quantities that use only different units of the same type, e.g. meters,
centimeters, miles, and thus can be numerically converted into each other.
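
The constraint of Graph 5.1, together with the exception noted in the footnote, can be approximated outside the theorem prover by a small Python check. This is a sketch under stated assumptions and is not OWLS script syntax: the UNITS table, its conversion factors and the function add are hypothetical names chosen for illustration.

```python
# Unit -> (dimension, factor to the base unit of that dimension); assumed table.
UNITS = {
    "m": ("length", 1.0),
    "cm": ("length", 0.01),
    "km": ("length", 1000.0),
    "s": ("time", 1.0),
}


def add(value_a, unit_a, value_b, unit_b):
    """Add two quantities; the result is expressed in the unit of the first one."""
    dim_a, factor_a = UNITS[unit_a]
    dim_b, factor_b = UNITS[unit_b]
    if dim_a != dim_b:
        # Different physical quantities: no number is the result of their sum.
        raise TypeError(f"cannot add a {dim_a} quantity to a {dim_b} quantity")
    # Same dimension, possibly different units: convert numerically, then add.
    return (value_a * factor_a + value_b * factor_b) / factor_a, unit_a


total, unit = add(1.0, "m", 50.0, "cm")
```

Adding 1 m and 50 cm succeeds after conversion, while a call such as add(1.0, "m", 2.0, "s") raises an error, which corresponds to the non-existence of the sum asserted by the rule.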
6 ATTR as a Second-Order Relation

In the previous sections, we analyzed graphs, some containing the relational label
Attr, that failed to carry their intended message. During the transformation of these
graphs into more meaningful ones, the relation (Attr) invariably disappeared from
the result. However, we might feel that there is a role for such a relation, especially in
the context of the question “What are the permanent, ‘inalienable’ properties (attribute
relations) associated with an object?” We will use ATTR to play a role similar to that
intended for Sowa's second-order Chrc. Using our example of Graph 4.3, we suggest
that (ATTR) is a second-order relation, with COLOR as the referent of its sink argument,
[RELATION_TYPE: color]. But what is the source argument? If we used a
first-order concept, e.g. [BALL], the following graph would assert the existence of a
ball that is somehow associated with the relational label COLOR:

? [BALL]→(ATTR)→[RELATION_TYPE: color].

Graph 6.1 

That may not seem satisfactory for several reasons:

1) We may want to say that all balls have some color, not that there only exist some
colorful balls. (Obviously, we could universally quantify over [BALL]);

2) We may want to describe the inheritance of properties from supertypes to subtypes;

3) We may want to keep connected graphs within FOL. (In Graph 6.1, the relation
(ATTR) connects the object model with its meta-level).

We suggest that a possible solution is for (ATTR) to use second-order concepts for 
both its arguments, as in: 




Graph 6.2 

You may notice that the lines of identity connect the referents of second-order
concepts to the type labels of first-order concepts. The following graph shows the
definition, by logical equivalence in terms of first-order concepts and relations, of the
relation (ATTR). (We could equally well use a λ-abstraction).

Definition 6.1 Notational shorthand for Peirce logic equivalence

Definition 6.2 Type A has attribute R iff for every instance of A there exists an instance of B that
is connected to the instance of A by R.




Graph 6.3 This shows the relations between attributes and attribute value spaces 




Graph 6.4 If type A is a supertype of type B and A has attribute R then B has attribute R 
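
Definition 6.2 and the inheritance rule of Graph 6.4 can be checked on a small extensional model. The following Python sketch is an assumption-laden illustration: the type hierarchy, the instances and the facts are invented sample data, the value type B of the definition is simplified to "any value", and the names SUPERTYPE, INSTANCE_OF, FACTS, is_a and has_attr are hypothetical.

```python
SUPERTYPE = {"WATER_SKI": "SKI", "SKI": "PHYS_OBJ"}  # child -> parent (assumed)
INSTANCE_OF = {"ski1": "SKI", "ski2": "WATER_SKI", "rock1": "PHYS_OBJ"}
# First-order facts as (instance, relation, value).
FACTS = {("ski1", "LENGTH", 167), ("ski2", "LENGTH", 150), ("ski1", "COLOR", "red")}


def is_a(type_label, ancestor):
    """True if type_label equals ancestor or is one of its subtypes."""
    while type_label is not None:
        if type_label == ancestor:
            return True
        type_label = SUPERTYPE.get(type_label)
    return False


def has_attr(type_label, relation):
    """Type A has attribute R iff every instance of A has some R-connected value."""
    members = [i for i, t in INSTANCE_OF.items() if is_a(t, type_label)]
    return all(
        any(source == i and rel == relation for source, rel, _ in FACTS)
        for i in members
    )


ski_has_length = has_attr("SKI", "LENGTH")
water_ski_has_length = has_attr("WATER_SKI", "LENGTH")
ski_has_color = has_attr("SKI", "COLOR")
```

Because every instance of a subtype is also an instance of its supertype, has_attr("SKI", "LENGTH") being true forces has_attr("WATER_SKI", "LENGTH") to be true as well, which is the content of Graph 6.4. Note that in this extensional reading a type without instances trivially has every attribute.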

The definitions of physical quantities and of the (ATTR) relation enable us to
reformulate Graph 2.5, [Ski: #]→(Chrc)→[Length]→(Amt)→[Measure: <167, cm>]. It
breaks down into:

(i) A first-order statement that answers the query “How long is the ski?”

[SKI: #]→(LENGTH)→[cm_NUMBER: 167].

(ii) A second-order statement, which is part of the answer to the request “Give me
one property that all skis have.”

[TYPE: ski]→(ATTR)→[REL_TYPE: length].

According to Def. 6.1, the second-order relation (ATTR) points to the first-order attribute
relations shared by a whole class of objects. However, we may want to show the unique
attribute relations of a unique individual object. For that we would need to use a relation
similar to Sowa's second-order (Chrc), or the version of (ATTR) shown in
Graph 6.1. In that context, we might ask whether a completely ad-hoc relation that is
not part of the semantic space of a fully defined type, i.e. one that is unique to an individual
concept, can be classified as an attribute. This and other questions concerning
meta-level descriptions of conceptual graphs will be addressed in another paper.



7 Conclusion 

Our study of the semantics of attribute-like conceptual relations was necessitated by the 
need to define them rigorously and reason with them formally. We have attempted to 
identify, categorize and suggest solutions to some of the typical causes of semantic de- 
ficiencies, repeatedly found in publicly accessible examples of conceptual graphs. 
These causes included the use of weak relations that do not provide a sufficient semantic
bond between their arguments, e.g. Has and, to a large extent, Attr. Another
source of problems was the use of types for the sink concepts of such relations that did 
not contain much or any relevant knowledge, e.g. [BIRTHDATE], [Child], 
[Length] , [Transitive] , etc. Based on our analysis, we have defined a formal 
representation of physical quantities, and shown a general approach to defining second- 
order relations from universally quantified first-order statements, as used in the defini- 
tion of the (ATTR) relation. This work seems to have uncovered a number of interest- 
ing problems, which need to be explored further, both theoretically and empirically. 
Abstract. In this paper, a contextual semantics for nested concept
graphs is presented, which has two aims in particular: we propose
a situation-based semantics that fits with Sowa’s understanding of contexts,
and we draw the connection to formal concept analysis by building on
formal contexts, more precisely triadic power context families. Its basic
idea is to understand interpreted nested concept graphs as situation-based
judgements in triadic power context families. With it, the logical
theory can be well developed and inferences can be characterized in three
ways: by semantical entailment, by means of a standard model, and by
a sound and complete calculus.



1 Introduction 

Several papers have been presented that develop the language of concept graphs 
over power context families in order to combine the approaches of formal concept 
analysis with the theory of conceptual graphs ([Wi97], [Wi98], [Pr98a], [Pr98b], 
[PW99]). The main idea of the logic approach presented in [Pr98b] was to de- 
fine simple concept graphs as syntactical constructs with a semantics in power 
context families. 

In this paper, an important extension for the language of conceptual graphs 
shall be included: the nestings in concept graphs. With nested conceptual graphs, 
knowledge can be structured and the coded information can be referred to differ- 
ent situations. A typical example is the concept graph shown below. It is taken 
(in a slightly changed form) from [MC96].

Its intuitive semantics is explained in the following way: The right answer
to the question “Is there a boat on the Mediterranean Sea?” would have to be
“Yes, in the context of picture A.”, because the information given in the nesting
of picture A only has a meaning in the context that is determined by A (cf.
[MC96, p. 46]). Thus, this intuitive semantics is based on the notion of context
as it is formalized in McCarthy’s theory of contexts [McC93]. It is related to the
notion of situations introduced by Jon Barwise and John Perry [BP83].

The discussion about a formal interpretation for nested conceptual graphs is
not at all finished. Important contributions have been made by John Sowa [So84]
(who developed a logical interpretation that exceeded first-order predicate
logic), Anne Preller [PMC95] (who developed a special logical language with
a system of Gentzen sequents) and Genevieve Simonet ([Si96], [GMS98]), who
proposed a logical interpretation in first-order logic where the predicates are
extended by another argument describing the context. All these approaches only
provide a logical syntax, but no explicit extensional semantics.

[Figure: nested concept graph in which the concept PICTURE: A contains the concept BOAT: #2, linked by a relation to the concept SEA: Mediterranean]

In this paper, a contextual semantics for nested concept graphs is presented.
It has two aims in particular: we propose a situation-based semantics
that fits with Sowa’s understanding of contexts (cf. [So97]), and we draw the
connection to formal concept analysis by building on formal contexts. Thus, the
semantics is contextual in two senses: it is based on Sowa’s understanding of
contexts and uses the basic ideas of situation semantics, and it provides a
semantics in formal contexts in the sense of formal concept analysis. That is why
we call it a situation-based, contextual semantics.

As a foundation for the situation-based approach, we will introduce situations
into the language of formal contexts in Section 2. Generalizing the logic
approach for simple concept graphs (cf. [Pr98b]), we will define nested concept
graphs as syntactical constructs with a semantics in triadic power context families
with situations in Section 3. In Section 4, we show that the logical theory
of nested concept graphs can be developed as in the case of simple concept
graphs. In particular, we can give three characterizations of inferences: a model-theoretic
notion of entailment, the validity in the standard model, and a sound
and complete calculus for a syntactical characterization.

2 Situations in Triadic Power Context Families 

A convincing formalization of situations was first given in Jon Barwise’s situation
theory [BP83], which has developed into the growing field of situation theory and
situation semantics. Here, we follow Keith Devlin, who starts his situation-based
theory of information in [De91] with the formalization of the smallest units of
information, the so-called infons. They look, for example, like this:

≪ marries, Bob, Carol, New York, 11/12/58, 1 ≫

An infon usually consists of a k-ary relation (here marries), k objects (here Bob,
Carol), some locations (here New York, 11/12/58) and the polarity 1 (if the
objects are in the relation at the given location) or 0 (if not) (cf. [De91, p. 35]).

Infons can refer to situations; it is then said that “an infon is made true by a
situation”. As it is difficult to formalize real situations, Devlin proposes the
definition of abstract situations as sets of infons.
These notions of infons and abstract situations can easily be translated into the
language of formal contexts: using triadic power context families, an infon is an
instance of the incidence relation of a context, and a situation is a subfamily
of a power context family. To explain this idea, let us first recall the formal
definitions of triadic contexts and triadic power context families as they are given
in [LW95], [Wi95] and [Wi97].

Definition 1. A triadic (formal) context is a quadruple $(G, M, B, Y)$, where
$G$, $M$ and $B$ are sets whose elements are called (formal) objects, attributes
and conditions, resp., and $Y$ is a ternary relation between $G$, $M$ and $B$ where
$(g, m, b) \in Y$ can be read as “object g has attribute m under condition b”.

In these triadic contexts, an infon $\ll R, g, b, 1 \gg$ (where $R$ is a unary relation,
$g$ an object and $b$ a location) can be understood as an instance of the ternary
incidence relation $Y$ of a triadic context. Thus, we consider the locations to be
conditions of a triadic context. If relations of higher arity are to be expressed, we
must consider a family of triadic contexts with identical sets of conditions where
the sets of objects consist of tuples of objects. For this, we need triadic power
context families.

Definition 2. A triadic power context family $\mathbb{K} := (\mathbb{K}_0, \ldots, \mathbb{K}_n)$ of type
$n$ (for an $n \in \mathbb{N}$) is a family of triadic contexts $\mathbb{K}_k := (G_k, M_k, B, Y_k)$ that satisfy
$G_k \subseteq (G_0)^k$ for each $k = 1, \ldots, n$. Then, we write $\mathbb{K} = (G_k, M_k, B, Y_k)_{k=0,\ldots,n}$.

In the language of triadic power context families, the infon $\ll R, g_1, \ldots, g_k, b, 1 \gg$
can be expressed by “$((g_1, \ldots, g_k), R, b) \in Y_k$”, and $\ll R, g_1, \ldots, g_k, b, 0 \gg$ by
“$((g_1, \ldots, g_k), R, b) \notin Y_k$”. (If the context is not considered to be dichotomic, we
do not allow infons with polarity 0).

The notion of an abstract situation (being a set of infons) can be translated
into the language of contexts as a subfamily of a power context family. In the
most general approach, arbitrary subfamilies are considered (cf. [Pr98a]), but
in this paper, we restrict ourselves to so-called simple situations, which are subfamilies
consisting of one “layer” of the triadic power context family.

Definition 3. A (simple) situation in the triadic power context family
$\mathbb{K} := (G_k, M_k, B, Y_k)_{k=0,\ldots,n}$ is a power context family $\mathbb{S}_b := (G_k, M_k, \{b\}, I_k)_{k=0,\ldots,n}$
satisfying $b \in B$ and $I_k = Y_k \cap (G_k \times M_k \times \{b\})$ for all $k = 0, \ldots, n$.
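
Definitions 2 and 3 can be made concrete with plain Python sets. The sketch below is only an illustration under our own assumptions: a family is a list of (G_k, M_k, Y_k) triples sharing one condition set B, the layer for k = 1 is left empty, and the second location, the object Ted and the function names situation and polarity are invented for the example.

```python
NY = "New York, 11/12/58"
PARIS = "Paris, 01/01/60"   # invented second condition, for illustration only
B = {NY, PARIS}
G0 = {"Bob", "Carol", "Ted"}

# K[k] = (G_k, M_k, Y_k); objects of K[k] are k-tuples over G0 for k >= 1.
K = [
    (G0, {"person"}, {(g, "person", b) for g in G0 for b in B}),
    (set(), set(), set()),
    ({("Bob", "Carol"), ("Ted", "Carol")}, {"marries"},
     {(("Bob", "Carol"), "marries", NY)}),
]


def situation(family, b):
    """Simple situation S_b: keep only the incidences that hold under condition b."""
    return [(G, M, {(g, m, c) for (g, m, c) in Y if c == b}) for (G, M, Y) in family]


def polarity(family, relation, objects, b):
    """Polarity of the infon << relation, objects..., b >> for k = len(objects) >= 1."""
    _, _, Y = family[len(objects)]
    return 1 if (tuple(objects), relation, b) in Y else 0


married_in_ny = polarity(K, "marries", ("Bob", "Carol"), NY)
ted_married_in_ny = polarity(K, "marries", ("Ted", "Carol"), NY)
paris_layer = situation(K, PARIS)
```

Here the infon about Bob and Carol has polarity 1 at the New York location, and the situation for the second condition contains no instance of marries at all.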

In this formalization, there is a natural translation for infons and situations
into the language of formal contexts: infons are the elementary judgements by
which the formal contexts of a power context family and the situations are
constituted. To be able to talk about the richer conceptual structure of a
power context family, we must provide concepts of situations and judgements of
the form “the object g falls under the concept c in the situation s”. Therefore,
situative concepts are defined in [Pr98a] as triadic concepts of the subfamilies,
i.e. of the situations. For this paper, we can simplify in the following way: as
each simple situation only has one condition, we can consider situations like
dyadic formal contexts having “dyadic” formal concepts that are called situative
concepts.

The units of thinking we work with are the so-called distributed concepts
consisting of one situative concept in each situation. Thus, a distributed concept
is a tuple of situative concepts that formalizes the continuity of concepts in
different situations:

Definition 4. A situative concept of order $k$ (for a $k \in \{0, \ldots, n\}$) of a situation
$\mathbb{S}_b := (G_k, M_k, \{b\}, I_k)_{k=0,\ldots,n}$ is a tuple $(A_1, A_2)$ with $A_1 \subseteq G_k$, $A_2 \subseteq M_k$
where $(A_1, A_2)$ is maximal with respect to set inclusion among all tuples with
$A_1 \times A_2 \times \{b\} \subseteq I_k$. For $t := (A_1, A_2)$, we call $\mathrm{Ext}\, t := A_1$ the extension of $t$ and
$\mathrm{Int}\, t := A_2$ the intension of $t$. The set of all situative concepts of $\mathbb{S}_b$ is denoted
$\mathfrak{B}(\mathbb{S}_b)$.

A distributed concept of order $k$ of the triadic power context family $\mathbb{K}$ is a
tuple $(t_b)_{b \in B}$ with $t_b \in \mathfrak{B}(\mathbb{S}_b)$ for all $b \in B$ (and for a $k \in \{0, \ldots, n\}$). For two
distributed concepts $(t_b)_{b \in B}$ and $(u_b)_{b \in B}$, we define the order $\le$ by

$(t_b)_{b \in B} \le (u_b)_{b \in B} \;:\Leftrightarrow\; \forall b \in B: \mathrm{Ext}(t_b) \subseteq \mathrm{Ext}(u_b)$

$(\Leftrightarrow\; \forall b \in B: \mathrm{Int}(t_b) \supseteq \mathrm{Int}(u_b))$.

As usual, we define the projection of the tuple onto its components by
$\pi_b((t_b)_{b \in B}) := t_b =: \pi_{\mathbb{S}_b}((t_b)_{b \in B})$. It is an important operation, because we do
usually not talk about the whole tuple of a distributed concept, but only about
some situative concepts of some specific situations.

One important special case for distributed concepts are those that are inten- 
sionally defined (as Wille proposed implicitly in [Wi98]). Formally, we define: 

Definition 5. Let $\mathbb{K} := (G_k, M_k, B, Y_k)_{k=0,\ldots,n}$ be a triadic power context family.
For a $k = 0, \ldots, n$ and $N \subseteq M_k$, the intensionally defined concept $(N)$ (of order
$k$) is defined by the largest distributed concept $(t_b)_{b \in B}$ satisfying $N \subseteq \mathrm{Int}(t_b)$
for all $b \in B$.

For this definition, we use implicitly that for each situation $\mathbb{S}_b$, the extension
of the intensionally defined concept $(N)$ satisfies the following condition:

$\mathrm{Ext}\, \pi_b((N)) = N^{Y_k \cap (G_k \times M_k \times \{b\})} = \{g \in G_k \mid (g, m, b) \in Y_k \text{ for all } m \in N\}$
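
For small finite contexts, situative concepts and the extension of an intensionally defined concept can be computed directly. The Python sketch below is a brute-force illustration under our own assumptions: it enumerates all attribute subsets (exponential in the number of attributes), treats one order k at a time, and uses invented sample data; the function names are hypothetical.

```python
from itertools import combinations


def extent(objects, incidence, attrs):
    """All objects that have every attribute in attrs."""
    return frozenset(g for g in objects if all((g, m) in incidence for m in attrs))


def intent(attributes, incidence, objs):
    """All attributes shared by every object in objs."""
    return frozenset(m for m in attributes if all((g, m) in incidence for g in objs))


def situative_concepts(G, M, Y, b):
    """Dyadic formal concepts of the layer of (G, M, B, Y) with condition b."""
    layer = {(g, m) for (g, m, c) in Y if c == b}
    concepts = set()
    attribute_list = sorted(M)
    for size in range(len(attribute_list) + 1):
        for attrs in combinations(attribute_list, size):
            ext = extent(G, layer, attrs)
            concepts.add((ext, intent(M, layer, ext)))
    return concepts


def intensional_extension(G, Y, b, N):
    """Ext of the b-th component of the intensionally defined concept (N)."""
    return frozenset(g for g in G if all((g, m, b) in Y for m in N))


# Invented sample context with two conditions "A" and "B".
G = {"boat", "ship", "rock"}
M = {"floats", "large"}
Y = {("boat", "floats", "A"), ("ship", "floats", "A"), ("ship", "large", "A"),
     ("rock", "large", "A"), ("boat", "floats", "B")}

concepts_a = situative_concepts(G, M, Y, "A")
floating_in_a = intensional_extension(G, Y, "A", {"floats"})
floating_in_b = intensional_extension(G, Y, "B", {"floats"})
```

The same attribute set {floats} yields a different extension in each situation, which is what a distributed concept records as a tuple.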

The notions introduced above will be needed in the next section to define the
semantics for the language of nested concept graphs.

3 Syntax and Semantics for Nested Concept Graphs 

According to the classification proposed by Chein and Mugnier in [CM97], we 
will consider untyped, positive, nested concept graphs without generic markers. 
That means, we do not allow negations, and the nestings are not specified by 
types. 

The syntax of the language for nested concept graphs will be defined as in
[Pr98b]. We start with an alphabet of object names, concept names and
relation names and define the concept graphs as the well-formed formulas over
this alphabet.

Definition 6. The alphabet for nested concept graphs is a tuple $\mathcal{A} := (\mathcal{G}, \mathcal{C}, \mathcal{R})$
where $\mathcal{G}$ is a finite set of object names, $(\mathcal{C}, \le_{\mathcal{C}})$ is a finite ordered set of concept
names, and $(\mathcal{R}, \le_{\mathcal{R}})$ is a finite set that is partitioned into finite ordered sets
$(\mathcal{R}_k, \le_{\mathcal{R}_k})_{k=1,\ldots,n}$ of relation names.

In accordance with [Pr98b], the structure of concept graphs is described by
means of nested directed multi-hypergraphs and labeling functions. We present
the formal definitions and give an explanation and an example below.

Definition 7. A nested, directed multi-hypergraph (of type $n$) is a quadruple
$(V, E, \nu, \sigma)$, where

• $V$ and $E$ are finite sets of vertices and edges, resp.,

• $\nu: E \to \bigcup_{k=1}^{n} V^k$ is a mapping (with an $n \ge 2$),

(For $e \in E$ with $\nu(e) = (v_1, \ldots, v_k)$, we denote $\nu(e)|_i := v_i$ and
$e^V := \{v \in V \mid \exists i = 1, \ldots, n: \nu(e)|_i = v\}$.)

• $\sigma: V \to \mathfrak{P}(V)$ is a mapping such that, for all $e \in E$ and $v, w \in V$ with $v \ne w$,
the conditions $e^V \subseteq V \setminus \sigma(w)$ or $e^V \subseteq \sigma(w)$, $\sigma(v) \cap \sigma(w) = \emptyset$ and $v \notin \sigma^m(v)$
are satisfied for all $m \in \mathbb{N}^+$.

By the triple $(V, E, \nu)$, a directed multi-hypergraph is described (the mapping
$\nu$ assigns the incident vertices to each directed edge). A vertex that is not
incident to any edge is called an isolated vertex. The mapping $\sigma$ describes
the nestings: If $v \in \sigma(w)$, we say that $v$ belongs to the nesting of $w$. The
vertices with nestings, i.e. the vertices with $\sigma(w) \neq \emptyset$, are called
complex vertices. All vertices which belong to any nesting are called inner
vertices, the vertices with $v \notin \sigma(w)$ for all $w \in V$ are called outer
vertices. The conditions for $\sigma$ make sure that the nestings are closed with
respect to the edges and that the nestings do not intersect and do not have
cycles. The following notations will be used:
$\mathrm{dom}\,\sigma := \{ v \in V \mid \sigma(v) \neq \emptyset \}$,
$\sigma^{-1}(v) := w$ for $w \in V$ with $v \in \sigma(w)$, and
$\sigma^{-1}(v) := \emptyset$ for $v \in V \setminus \bigcup_{w \in V} \sigma(w)$.
Analogously, $\sigma^{-1}(e^V) := w$ for $w \in V$ with $e^V \subseteq \sigma(w)$, and
$\sigma^{-1}(e^V) := \emptyset$ for $e^V \subseteq V \setminus \bigcup_{w \in V} \sigma(w)$.
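The three conditions on the nesting mapping (closure with respect to edges, disjointness, acyclicity) can be checked mechanically. The following Python sketch is an illustration only; the representation of edges as a dictionary from edge names to tuples of incident vertices, of the nesting mapping as a dictionary from vertices to sets, and the function name `is_valid_nesting` are assumptions made here, not notation from the paper.

```python
def is_valid_nesting(vertices, edges, sigma):
    """Check the conditions of Definition 7 on the nesting mapping.

    edges: dict mapping an edge to the tuple of its incident vertices
    sigma: dict mapping a vertex to the set of vertices nested in it
           (vertices without nesting may be omitted)
    """
    def nested(v):
        return sigma.get(v, set())

    # Nestings of distinct vertices must not intersect.
    vs = list(vertices)
    for i, v in enumerate(vs):
        for w in vs[i + 1:]:
            if nested(v) & nested(w):
                return False

    # Every edge lies completely inside or completely outside each nesting.
    for incident in edges.values():
        ev = set(incident)
        for w in vertices:
            if not (ev <= nested(w) or ev.isdisjoint(nested(w))):
                return False

    # No cycles: v must not occur in any iterated nesting of itself.
    for v in vertices:
        frontier, seen = set(nested(v)), set()
        while frontier:
            if v in frontier:
                return False
            seen |= frontier
            frontier = set().union(*(nested(x) for x in frontier)) - seen
    return True
```

For instance, a hypergraph in which an edge connects a vertex inside a nesting with a vertex outside of it is rejected by the second check.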

With these notions and notations, a syntax for our language can be specified by
defining the structure of nested concept graphs.

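Definition 8. A nested concept graph (of type $n$) over the alphabet
$\mathcal{A} := (\mathcal{G}, \mathcal{C}, \mathcal{R})$ is a structure
$\mathfrak{G} := (V, E, \nu, \sigma, \kappa, \rho)$ where

• $(V, E, \nu, \sigma)$ is a nested, directed multi-hypergraph (of type $n$),

• $\kappa : V \cup E \to \mathcal{C} \cup \mathcal{R}$ is a mapping with
$\kappa(V) \subseteq \mathcal{C}$ and $\kappa(E) \subseteq \mathcal{R}$ such that all
$e \in E$ with $\nu(e) = (v_1, \dots, v_k)$ satisfy $\kappa(e) \in \mathcal{R}_k$, and

• $\rho : V \to \mathcal{G}$ is a mapping.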

As in the case of simple concept graphs, the mapping $\kappa$ labels the vertices
and edges with concept and relation names. The mapping $\rho$ assigns, to each
vertex $v$, a reference $\rho(v)$. If $v \in \sigma(w)$, we say that $v$ belongs to the
nesting referenced by $\rho(w)$. In order to abbreviate, we denote
$\rho(e) := (\rho(v_1), \dots, \rho(v_k))$ for the edge $e \in E$ with
$\nu(e) = (v_1, \dots, v_k)$ and call $\rho(e)$ the reference of $e$.

Let us explain the formal definitions by the example given on the first page:
The nested concept graph has five vertices that are labeled (by means of $\kappa$)
by the concept names PERSON, THINK, PICTURE, BOAT and SEA. The mapping $\rho$
assigns the object names Pierre, #1, A, #2 and Mediterranean. The three edges
are labeled (by means of $\kappa$) by the relation names agent, patient and on. The
nesting is described by $\sigma$, which assigns the vertices labeled with BOAT and
SEA to the vertex labeled with PICTURE. To each other vertex, $\emptyset$ is assigned.
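To make the roles of the mappings concrete, the example can be written down as a data structure. The following Python sketch is an illustration under stated assumptions: the class `NestedConceptGraph`, the vertex and edge identifiers `v1`, ..., `e3`, and in particular the choice of incident vertices for the three edges are hypothetical, since the figure is not reproduced in this section.

```python
from dataclasses import dataclass


@dataclass
class NestedConceptGraph:
    vertices: set
    nu: dict     # edge -> tuple of incident vertices
    sigma: dict  # complex vertex -> set of vertices nested in it
    kappa: dict  # vertex or edge -> concept or relation name
    rho: dict    # vertex -> object name (reference)

    def inner_vertices(self):
        return set().union(*self.sigma.values())

    def outer_vertices(self):
        return self.vertices - self.inner_vertices()

    def edge_reference(self, e):
        # rho(e) := (rho(v_1), ..., rho(v_k)) for nu(e) = (v_1, ..., v_k)
        return tuple(self.rho[v] for v in self.nu[e])


# Assumed incidences: THINK -agent- PERSON, THINK -patient- PICTURE, BOAT -on- SEA.
example = NestedConceptGraph(
    vertices={"v1", "v2", "v3", "v4", "v5"},
    nu={"e1": ("v2", "v1"), "e2": ("v2", "v3"), "e3": ("v4", "v5")},
    sigma={"v3": {"v4", "v5"}},
    kappa={"v1": "PERSON", "v2": "THINK", "v3": "PICTURE", "v4": "BOAT",
           "v5": "SEA", "e1": "agent", "e2": "patient", "e3": "on"},
    rho={"v1": "Pierre", "v2": "#1", "v3": "A", "v4": "#2",
         "v5": "Mediterranean"},
)
```

In this encoding, the vertex `v3` is the only complex vertex, and the edge `e3` lies completely inside its nesting.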

As we can see, the syntax of nested concept graphs is only a slight
generalization of the syntax for simple concept graphs. We just have to add the
mapping $\sigma$
and some conditions. The crucial point is the interpretation of these syntactical 
constructs. Therefore, we will present a situation-based approach. 

The basic idea for a contextual situation-based semantics for nested concept
graphs is that we want to understand nested concept graphs as situation-based
judgements in a triadic power context family. That means, whenever objects are
put in relation by a syntactical construct, this is considered to be a
situation-specific judgement that is interpreted in the restricted world of a
formal context. Therefore, object names are interpreted twice, as objects and as
situations of a power context family, because the situations themselves are
references of concepts. By this double interpretation, we respect the double
character of the object names in nested concept graphs. The concept names are
interpreted by distributed concepts of order 0 and the relation names of
$\mathcal{R}_k$ (for $k = 1, \dots, n$) by distributed concepts of order $k$. By this,
concept names can be interpreted situation-based when we consider, in each
situation, the corresponding component. More formally:

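Definition 9. Let $\mathcal{A} := (\mathcal{G}, \mathcal{C}, \mathcal{R})$ be an alphabet,
$\mathbb{K} := (G_k, M_k, B, Y_k)_{k=0,\dots,n}$ be a triadic power context family and
$\mathfrak{S} := \{ \mathbb{S}_b \mid b \in B \}$. We denote $\mathfrak{C}(\mathbb{K})$ for
the set of distributed concepts of order 0, and (for each $k = 1, \dots, n$) we
denote $\mathfrak{R}_k(\mathbb{K})$ for the set of distributed concepts of $\mathbb{K}$ of
order $k$ and $\mathfrak{R}(\mathbb{K}) := \bigcup_{k=1,\dots,n} \mathfrak{R}_k(\mathbb{K})$.
The union $\iota := \iota_{\mathcal{G}} \cup \iota_{\mathcal{S}} \cup \iota_{\mathcal{C}} \cup \iota_{\mathcal{R}}$
is called a $\mathbb{K}$-interpretation of $\mathcal{A}$, if the mappings satisfy the
following conditions:

• $\iota_{\mathcal{G}} : \mathcal{G} \to G_0$ and $\iota_{\mathcal{S}} : \mathcal{G} \to \mathfrak{S}$,

• $\iota_{\mathcal{C}} : \mathcal{C} \to \mathfrak{C}(\mathbb{K})$ is order-preserving and

• $\iota_{\mathcal{R}} : \mathcal{R} \to \mathfrak{R}(\mathbb{K})$ is order-preserving and
satisfies $\iota_{\mathcal{R}}(\mathcal{R}_k) \subseteq \mathfrak{R}_k(\mathbb{K})$ for all
$k = 1, \dots, n$.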

Now, it can be specified under what conditions a concept graph is valid under
a $\mathbb{K}$-interpretation. For the concept graph in our first example, we will
postulate, for example, that the object by which the object name #2 is
interpreted really falls under the situative concept interpreting BOAT in the
situation interpreting A. Thus, we have to assign BOAT to a distributed concept
whose component for the situation interpreting A satisfies this condition. With
this, we have formalized the subjudgment “$\iota_{\mathcal{G}}(\#2)$ falls under the
concept $\iota_{\mathcal{C}}(\mathrm{BOAT})$ in the situation $\iota_{\mathcal{S}}(A)$”
(see the formal definition). For outer vertices, we must specify a universal
situation $\mathbb{S}_u$ and claim the analogous conditions for the references. The
edge conditions can be formulated in a similar way. For the formal definition,
it might help to look at the following schematized nested concept graph.



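[Figure: schematized nested concept graph with vertices labeled $\kappa(w) : \rho(w)$, $\kappa(v) : \rho(v)$, $\kappa(u) : \rho(u)$ and $\kappa(t) : \rho(t)$]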



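Definition 10. Let $\mathcal{A}$ be an alphabet and $\mathbb{K}$ a triadic power context
family. If $\iota$ is a $\mathbb{K}$-interpretation of $\mathcal{A}$ and $u \in B$, the
triple $(\mathbb{K}, u, \iota)$ is called a context-interpretation of $\mathcal{A}$.

The nested concept graph $\mathfrak{G} := (V, E, \nu, \sigma, \kappa, \rho)$ is called
valid in $(\mathbb{K}, u, \iota)$, and $(\mathbb{K}, u, \iota)$ is called a model for
$\mathfrak{G}$, if

i.) all $v, w \in V$ with $v \in \sigma(w)$ satisfy
$\iota_{\mathcal{G}}\rho(v) \in \mathrm{Ext}(\pi_{\iota_{\mathcal{S}}\rho(w)}\, \iota_{\mathcal{C}}\kappa(v))$,

ii.) all $v \in V \setminus \bigcup_{w \in V} \sigma(w)$ satisfy
$\iota_{\mathcal{G}}\rho(v) \in \mathrm{Ext}(\pi_{\mathbb{S}_u}\, \iota_{\mathcal{C}}\kappa(v))$,

iii.) all $e \in E$ with $e^V \subseteq \sigma(w)$ for a $w \in V$ satisfy
$\iota_{\mathcal{G}}\rho(e) \in \mathrm{Ext}(\pi_{\iota_{\mathcal{S}}\rho(w)}\, \iota_{\mathcal{R}}\kappa(e))$,

iv.) all $e \in E$ with $e^V \subseteq V \setminus \bigcup_{w \in V} \sigma(w)$ satisfy
$\iota_{\mathcal{G}}\rho(e) \in \mathrm{Ext}(\pi_{\mathbb{S}_u}\, \iota_{\mathcal{R}}\kappa(e))$.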

The conditions i.) and ii.) generalize the vertex conditions for inner and outer 
vertices, the conditions iii.) and iv.) are edge conditions for inner and outer edges. 

To sum up, we have introduced a semantics that allows an interpretation of 
nested concept graphs in triadic power context families with situations. By this, 
the judgements are interpreted situation-specifically, and in this way, we obtain 
a contextual, situation-based semantics. 

4 Reasoning with Nested Concept Graphs 

The Standard Model and Semantical Entailment 

For reasoning with nested concept graphs, we can specify a standard model
in which all the information that is coded in the concept graph is given on the
context level. Before giving a precise formal definition for the standard model,
we give an idea of how the power context family for the standard model is
constructed. Starting with a nested concept graph
$\mathfrak{G} := (V, E, \nu, \sigma, \kappa, \rho)$ and an alphabet
$\mathcal{A} := (\mathcal{G}, \mathcal{C}, \mathcal{R})$, we take the object names of
$\mathcal{G}$ as objects of the context $\mathbb{K}_0$. The sets of objects for the
higher contexts are constructed by $\mathcal{G}^k$. As sets of attributes for each
context we take the concept names or relation names with corresponding arity,
respectively.

To be able to construct all necessary situations, the conditions of the
contexts are defined by the set of all object names of
$\rho(\mathrm{dom}\,\sigma)$, that means by all object names that appear as a
reference of a complex vertex. Then, we add a universal condition $u$ for outer
vertices and an “empty condition” $\varepsilon$ for the references of non-complex
vertices.

If we choose the right incidence relations $Y_k$, we can interpret the object
names by themselves, i.e. $\iota_{\mathcal{G}} = \mathrm{id}_{\mathcal{G}}$, and by
$\iota_{\mathcal{S}}$ with $\iota_{\mathcal{S}}(g) = \mathbb{S}_g$ or $\mathbb{S}_\varepsilon$.
The concept and relation names are interpreted by $\iota_{\mathcal{C}}$ and
$\iota_{\mathcal{R}}$ as the corresponding intensionally defined concepts.

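Definition 11. For the nested concept graph
$\mathfrak{G} := (V, E, \nu, \sigma, \kappa, \rho)$ over the alphabet
$\mathcal{A} := (\mathcal{G}, \mathcal{C}, \mathcal{R})$, the triadic power context family
$\mathbb{K}^{\mathfrak{G}} := (G_k^{\mathfrak{G}}, M_k^{\mathfrak{G}}, B^{\mathfrak{G}}, Y_k^{\mathfrak{G}})_{k=0,\dots,n}$
is defined by $G_0^{\mathfrak{G}} := \mathcal{G}$, $M_0^{\mathfrak{G}} := \mathcal{C}$,
$G_k^{\mathfrak{G}} := \mathcal{G}^k$ and $M_k^{\mathfrak{G}} := \mathcal{R}_k$ for all
$k = 1, \dots, n$ and $B^{\mathfrak{G}} := \rho(\mathrm{dom}\,\sigma) \cup \{u, \varepsilon\}$.

For all $k = 0, \dots, n$, the incidence relations $Y_k^{\mathfrak{G}}$ are defined in
such a way that all $g \in \mathcal{G}$, $(g_1, \dots, g_k) \in \mathcal{G}^k$,
$c \in \mathcal{C}$, $R \in \mathcal{R}_k$ with $k = 1, \dots, n$ and
$b \in B^{\mathfrak{G}} \setminus \{u, \varepsilon\}$ satisfy

• $(g, c, b) \in Y_0^{\mathfrak{G}} :\iff \exists v, w \in V$:
$v \in \sigma(w)$, $\rho(v) = g$, $\kappa(v) \leq_{\mathcal{C}} c$ and $\rho(w) = b$,

• $(g, c, u) \in Y_0^{\mathfrak{G}} :\iff \exists v \in V \setminus \bigcup_{w \in V} \sigma(w)$:
$\rho(v) = g$, $\kappa(v) \leq_{\mathcal{C}} c$,

• $((g_1, \dots, g_k), R, b) \in Y_k^{\mathfrak{G}} :\iff \exists e \in E$:
$\rho(e) = (g_1, \dots, g_k)$, $\kappa(e) \leq_{\mathcal{R}} R$ and $\rho(\sigma^{-1}(e^V)) = b$,

• $((g_1, \dots, g_k), R, u) \in Y_k^{\mathfrak{G}} :\iff \exists e \in E$:
$\rho(e) = (g_1, \dots, g_k)$, $\kappa(e) \leq_{\mathcal{R}} R$ and $\sigma^{-1}(e^V) = \emptyset$,

• $(g, c, \varepsilon) \notin Y_0^{\mathfrak{G}}$ and $((g_1, \dots, g_k), R, \varepsilon) \notin Y_k^{\mathfrak{G}}$.

The $\mathbb{K}$-interpretation
$\iota^{\mathfrak{G}} := \iota_{\mathcal{G}}^{\mathfrak{G}} \cup \iota_{\mathcal{S}}^{\mathfrak{G}} \cup \iota_{\mathcal{C}}^{\mathfrak{G}} \cup \iota_{\mathcal{R}}^{\mathfrak{G}}$
is defined by $\iota_{\mathcal{G}}^{\mathfrak{G}} := \mathrm{id}_{\mathcal{G}}$;
$\iota_{\mathcal{S}}^{\mathfrak{G}}(g) := \mathbb{S}_g$ for all $g \in \rho(\mathrm{dom}\,\sigma)$ and
$\iota_{\mathcal{S}}^{\mathfrak{G}}(g) := \mathbb{S}_\varepsilon$ for all
$g \in \mathcal{G} \setminus \rho(\mathrm{dom}\,\sigma)$;
$\iota_{\mathcal{C}}^{\mathfrak{G}}(c) := \langle c \rangle$ for all $c \in \mathcal{C}$; and
$\iota_{\mathcal{R}}^{\mathfrak{G}}(R) := \langle R \rangle$ for all $R \in \mathcal{R}$. The
context-interpretation $(\mathbb{K}^{\mathfrak{G}}, u, \iota^{\mathfrak{G}})$ is called the
standard model of $\mathfrak{G}$.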

It is easy to prove that the standard model is really a model for
$\mathfrak{G}$ (by checking vertex and edge conditions).

The standard model is a useful tool for reasoning with nested concept
graphs. As usual in logic, inferences of nested concept graphs over the same
alphabet can be mathematized by semantical entailment. Let us briefly recall
the definition: We say that the nested concept graph $\mathfrak{G}_1$ entails
(semantically) $\mathfrak{G}_2$, if $\mathfrak{G}_2$ is valid in all models for
$\mathfrak{G}_1$. Then, we denote $\mathfrak{G}_1 \models \mathfrak{G}_2$. This relation
can be characterized by means of the standard model as it is done in the
following proposition (which is proved in the appendix).

Proposition 1. The nested concept graph $\mathfrak{G}_1$ entails $\mathfrak{G}_2$, if
and only if $\mathfrak{G}_2$ is valid in the standard model of $\mathfrak{G}_1$.

That means, by translating the information given in a nested concept graph 
onto the context level, i.e. by constructing the standard model, we can easily 
treat questions of inference. For every concept graph, we decide if it is entailed 
by checking the validity in the standard model. 
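The decision procedure suggested by Proposition 1 can be sketched as follows: build the incidence relations of the standard model of the first graph and check the vertex and edge conditions of the second graph against them. The Python code below is a minimal sketch under stated assumptions, not the construction of the paper in full generality: graphs are given as dictionaries with the hypothetical keys `V`, `nu`, `sigma`, `kappa`, `rho`; the orders on concept and relation names are passed as a function `leq`; the sentinel `U` stands for the universal condition; and relation names are not separated by arity.

```python
U = "__universal__"  # assumed sentinel for the universal condition u


def sigma_inverse(g):
    """Map each inner vertex to the complex vertex whose nesting contains it."""
    return {v: w for w, nested in g["sigma"].items() for v in nested}


def condition_of(g, v, inv):
    """Reference of the enclosing complex vertex, or U for outer vertices."""
    return g["rho"][inv[v]] if v in inv else U


def standard_incidence(g, concepts, relations, leq):
    """Incidence triples of the standard model, following Definition 11."""
    inv = sigma_inverse(g)
    y0 = {(g["rho"][v], c, condition_of(g, v, inv))
          for v in g["V"] for c in concepts if leq(g["kappa"][v], c)}
    # All incident vertices of an edge lie in the same nesting (Definition 7),
    # so the first incident vertex determines the condition of the edge.
    yk = {(tuple(g["rho"][v] for v in inc), r, condition_of(g, inc[0], inv))
          for e, inc in g["nu"].items() for r in relations
          if leq(g["kappa"][e], r)}
    return y0, yk


def entails(g1, g2, concepts, relations, leq):
    """Check whether g2 is valid in the standard model of g1."""
    y0, yk = standard_incidence(g1, concepts, relations, leq)
    inv = sigma_inverse(g2)
    vertices_ok = all(
        (g2["rho"][v], g2["kappa"][v], condition_of(g2, v, inv)) in y0
        for v in g2["V"])
    edges_ok = all(
        (tuple(g2["rho"][v] for v in inc), g2["kappa"][e],
         condition_of(g2, inc[0], inv)) in yk
        for e, inc in g2["nu"].items())
    return vertices_ok and edges_ok
```

With such a check, a graph that states BOAT: #2 as an outer vertex is not entailed by a graph in which BOAT: #2 only occurs inside the nesting referenced by A, because the judgement would no longer be in the right situation.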



Inference Rules

In addition to these two ways to treat inferences, we can give a characterization 
on the syntactical level, that means we can specify a sound and complete set of 
inference rules with respect to the semantical entailment. 

Definition 12. We call the nested concept graph $\mathfrak{G}_2$ derivable from
$\mathfrak{G}_1$ (and denote $\mathfrak{G}_1 \vdash \mathfrak{G}_2$), if $\mathfrak{G}_2$ can
be derived from $\mathfrak{G}_1$ by the following inference rules (which are
elaborated in the appendix):

1. Double a vertex.

Double a vertex $v$ and its incident edges (several times if $v$ occurs more
than once in $\nu(e)$). Double also all vertices and edges that are nested in the
doubled vertex. Extend the mappings $\kappa$ and $\rho$ to the doubles.

2. Delete an isolated, non-complex vertex.

Delete a vertex $v$ and restrict $\kappa$ and $\rho$ accordingly.

3. Double an edge.

Double an edge $e$ and extend the mappings $\nu$ and $\kappa$ to the double.

4. Delete an edge.

Delete an edge $e$ and restrict the mappings $\nu$ and $\kappa$ accordingly.

5. Exchange a concept name.

Substitute the assignment $v \mapsto c$ for $v \mapsto \kappa(v)$ for such a concept
name $c \in \mathcal{C}$ for which there is a vertex $w \in V$ with
$\kappa(w) \leq_{\mathcal{C}} c$, $\rho(v) = \rho(w)$, and
$\rho(\sigma^{-1}(v)) = \rho(\sigma^{-1}(w))$ (i.e. a vertex that has the same
reference and is in a nesting with equal reference).

6. Exchange a relation name.

Substitute the assignment $e \mapsto R$ for $e \mapsto \kappa(e)$ for such a relation
name $R \in \mathcal{R}$ for which there is an edge $f \in E$ with
$\kappa(f) \leq_{\mathcal{R}} R$, $\rho(e) = \rho(f)$, and
$\rho(\sigma^{-1}(e^V)) = \rho(\sigma^{-1}(f^V))$.

7. Join vertices with equal references in the same nesting.

Join two vertices $v, w \in V$ satisfying $\rho(v) = \rho(w)$ and
$\sigma^{-1}(v) = \sigma^{-1}(w)$ into a vertex $v \vee w$ with the same incident
edges and references, and set $\kappa(v \vee w) = c$ for a $c \in \mathcal{C}$ with
$\kappa(v) \leq_{\mathcal{C}} c$ and $\kappa(w) \leq_{\mathcal{C}} c$. The sub graphs that
are nested in $v$ and $w$ are combined by juxtaposition and nested in $v \vee w$.

8. Copy a sub concept graph into an equally referenced nesting. 

Construct a sub graph (with all complete edges and nestings) that is identical 
to the first sub graph (up to the names of vertices and edges) into a nesting 
with equal references. 



Proposition 2 (Soundness and Completeness). 

Two nested concept graphs $\mathfrak{G}_1$ and $\mathfrak{G}_2$ over the same alphabet
satisfy

$\mathfrak{G}_1 \models \mathfrak{G}_2 \iff \mathfrak{G}_1 \vdash \mathfrak{G}_2$.

The soundness of these rules can be shown easily when the rules are specified
more formally (see appendix). Then, the proof is analogous to the proof of
soundness for simple concept graphs in [Pr98b]. An idea for the proof of
completeness is given in the appendix.
On the whole, we have found a sound and complete system of inference rules
for nested concept graphs. Thus, we have three possibilities to characterize
inferences: semantical entailment, syntactical derivability and validity in the
standard model. This shows that the semantics we proposed in Section 3 allows us
to develop a rich logical theory for nested concept graphs.

5 Coding Nested Concept Graphs in Graphs and in Contexts

By constructing the standard model of a given nested concept graph, we can 
translate the information coded in the concept graph into the context level. 
Proposition 1 makes sure that the informational content is really identical. 

There is also a way back from the context level to the graph level: For a 
given triadic power context family, we can construct a nested concept graph 
that encodes the same information. Logically speaking, this is the nested concept 
graph that entails all concept graphs being valid in the power context family. 
We will only give an idea of the construction of this so-called standard graph. 
For details, see [Pr98a]. 

We start with a triadic power context family $\mathbb{K}$, a $u \in B$ and an
(injective) mapping $o : B \setminus \{u\} \to G_0$ that assigns an object to each
condition of $\mathbb{K}$. In order to determine the standard graph, we must specify
an alphabet whose elements are interpreted in the power context family. The set
of object names will be $G_0$, the sets of concept and relation names will be the
power sets of attributes of the corresponding contexts, i.e. we define
$\mathcal{A} := (G_0, \mathfrak{P}(M_0), \bigcup_{k=1,\dots,n} \mathfrak{P}(M_k))$ and
interpret each concept and relation name by the generated intensionally defined
distributed concept.

For the construction of the standard graph, we proceed as follows: For each
object $g \in G_0$ and each condition $b \in B$, we construct the vertex $(g, b)$
that is referenced with $g$ and labeled with the concept intent of the smallest
situative concept of $\mathbb{S}_b$ that has $g$ in its extension. We can easily see
that this is exactly the set
$g^{Y_0^b} := \{ m \in M_0 \mid (g, m, b) \in Y_0 \}$. Analogously, we construct
the edges. For each $k$-ary tuple $(g_1, \dots, g_k) \in G_k$ and each $b \in B$, we
construct an edge $((g_1, \dots, g_k), b)$. It is labeled by
$(g_1, \dots, g_k)^{Y_k^b} := \{ m \in M_k \mid ((g_1, \dots, g_k), m, b) \in Y_k \}$,
and its incident vertices are $(g_1, b), \dots, (g_k, b)$. Edges and vertices that
belong to a condition $b \in B \setminus \{u\}$ are nested in the vertex
$(o(b), u)$. In this way, we obtain a nested concept graph with nestings of
maximal depth 1. Its non-complex vertices are of the form $(g, u)$ where
$g \notin o(B \setminus \{u\})$.
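The construction just described can be sketched in a few lines. The following Python code is an illustration under simplifying assumptions that are not part of the paper: all tuples of the contexts of order $k \geq 1$ are passed in one set `tuples`, the incidence relations of these contexts are merged into one set `Yk`, objects are assumed not to be tuples themselves, and the function name `standard_graph` is hypothetical.

```python
def standard_graph(G0, B, Y0, tuples, Yk, u, o):
    """Build (nu, sigma, kappa, rho) of the standard graph.

    tuples: set of object tuples (elements of the G_k with k >= 1)
    o:      dict mapping each condition b != u to an object of G0
    """
    # One vertex (g, b) per object and condition, referenced with g.
    rho = {(g, b): g for g in G0 for b in B}
    # Vertex labels: all attributes incident with g under the condition b.
    kappa = {(g, b): frozenset(m for (h, m, c) in Y0 if (h, c) == (g, b))
             for g in G0 for b in B}
    # One edge (t, b) per tuple and condition, incident with (g_1, b), ...
    nu = {(t, b): tuple((g, b) for g in t) for t in tuples for b in B}
    for (t, b) in nu:
        kappa[(t, b)] = frozenset(m for (s, m, c) in Yk if (s, c) == (t, b))
    # Everything belonging to a condition b != u is nested in (o(b), u).
    sigma = {(o[b], u): {(g, b) for g in G0} for b in B if b != u}
    return nu, sigma, kappa, rho
```

The labels are sets of attributes, in line with the choice of the power sets of attributes as concept and relation names.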

It can be shown that this standard graph is valid in the model we started
from and that it has the desired property of entailing all other nested concept
graphs that are valid in $(\mathbb{K}, u, \iota)$, i.e. that it encodes, with respect
to the chosen alphabet, all information of the power context family $\mathbb{K}$.

With it, we have the possibility to translate information which is given on 
the context level onto the graph level and vice versa. This allows an integration 
of nested concept graphs into the framework of formal concept analysis. Thus, 
nested concept graphs can be activated for conceptual knowledge processing 
with power context families. Knowledge systems can be constructed in which
knowledge is coded in power context families and in which nested concept graphs 
serve for the communication about the knowledge. 

6 Appendix: Formal Definitions and Proofs 

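Proof for Proposition 1. We only show the non-trivial implication: If the
nested concept graph $\mathfrak{G}_2 := (V_2, E_2, \nu_2, \sigma_2, \kappa_2, \rho_2)$ is
valid in the standard model
$\mathbb{M}^{\mathfrak{G}_1} := (\mathbb{K}^{\mathfrak{G}_1}, u, \iota^{\mathfrak{G}_1})$ of
$\mathfrak{G}_1 := (V_1, E_1, \nu_1, \sigma_1, \kappa_1, \rho_1)$, then it is valid in
every other model $\mathbb{M} := (\mathbb{K}, t, \lambda)$ for $\mathfrak{G}_1$ with
$\mathbb{K} := (G_k, M_k, B, Y_k)_{k=0,\dots,n}$ and
$\lambda := \lambda_{\mathcal{G}} \cup \lambda_{\mathcal{S}} \cup \lambda_{\mathcal{C}} \cup \lambda_{\mathcal{R}}$.

i.) Let $v, w \in V_2$ with $v \in \sigma_2(w)$. We show the vertex condition for
$\mathfrak{G}_2$ in $\mathbb{M}$, i.e.
$\lambda_{\mathcal{G}}\rho_2(v) \in \mathrm{Ext}(\pi_{\lambda_{\mathcal{S}}\rho_2(w)}\, \lambda_{\mathcal{C}}\kappa_2(v))$.
As $\mathfrak{G}_2$ is valid in $\mathbb{M}^{\mathfrak{G}_1}$, the vertex condition in
$\mathbb{M}^{\mathfrak{G}_1}$ and the definition of the incidence relation
$Y_0^{\mathfrak{G}_1}$ imply
$\iota_{\mathcal{G}}^{\mathfrak{G}_1}\rho_2(v) = \rho_2(v) \in \mathrm{Ext}\, \pi_{\mathbb{S}_{\rho_2(w)}}\, \iota_{\mathcal{C}}^{\mathfrak{G}_1}\kappa_2(v) = \mathrm{Ext}\, \pi_{\mathbb{S}_{\rho_2(w)}} \langle \kappa_2(v) \rangle = \{ g \in \mathcal{G} \mid (g, \kappa_2(v), \rho_2(w)) \in Y_0^{\mathfrak{G}_1} \} = \{ \rho_1(x) \mid x, y \in V_1 : x \in \sigma_1(y),\ \kappa_1(x) \leq_{\mathcal{C}} \kappa_2(v),\ \rho_1(y) = \rho_2(w) \}$.

Thus, for all vertices $v, w \in V_2$ with $v \in \sigma_2(w)$ there is a pair
$x, y \in V_1$ with $x \in \sigma_1(y)$, $\rho_1(x) = \rho_2(v)$,
$\kappa_1(x) \leq_{\mathcal{C}} \kappa_2(v)$ and $\rho_1(y) = \rho_2(w)$. From this, we
can conclude (with the vertex condition for $\mathfrak{G}_1$ in $\mathbb{M}$ and the
order-preserving property of $\lambda_{\mathcal{C}}$) that
$\lambda_{\mathcal{G}}\rho_2(v) = \lambda_{\mathcal{G}}\rho_1(x) \in \mathrm{Ext}(\pi_{\lambda_{\mathcal{S}}\rho_1(y)}(\lambda_{\mathcal{C}}\kappa_1(x))) \subseteq \mathrm{Ext}(\pi_{\lambda_{\mathcal{S}}\rho_2(w)}(\lambda_{\mathcal{C}}\kappa_2(v)))$.

ii.) Analogously, we show the vertex condition
$\lambda_{\mathcal{G}}\rho_2(v) \in \mathrm{Ext}(\pi_{\mathbb{S}_t}\, \lambda_{\mathcal{C}}\kappa_2(v))$
for $v \in V_2 \setminus \bigcup_{w \in V_2} \sigma_2(w)$. As $\mathfrak{G}_2$ is valid in
$\mathbb{M}^{\mathfrak{G}_1}$, we have
$\iota_{\mathcal{G}}^{\mathfrak{G}_1}\rho_2(v) = \rho_2(v) \in \mathrm{Ext}\, \pi_{\mathbb{S}_u}\, \iota_{\mathcal{C}}^{\mathfrak{G}_1}\kappa_2(v) = \{ g \in \mathcal{G} \mid \exists x \in V_1 \setminus \bigcup_{w \in V_1} \sigma_1(w) : \kappa_1(x) \leq_{\mathcal{C}} \kappa_2(v),\ \rho_1(x) = g \}$.

Thus, for every outer vertex $v \in V_2 \setminus \bigcup_{w \in V_2} \sigma_2(w)$
there is an outer vertex $x \in V_1 \setminus \bigcup_{w \in V_1} \sigma_1(w)$ with
$\kappa_1(x) \leq_{\mathcal{C}} \kappa_2(v)$ and $\rho_1(x) = \rho_2(v)$. From this, we
can conclude (with the vertex condition for $\mathfrak{G}_1$ in $\mathbb{M}$ and the
order-preserving property of $\lambda_{\mathcal{C}}$) the assertion by
$\lambda_{\mathcal{G}}\rho_2(v) = \lambda_{\mathcal{G}}\rho_1(x) \in \mathrm{Ext}(\pi_{\mathbb{S}_t}\, \lambda_{\mathcal{C}}\kappa_1(x)) \subseteq \mathrm{Ext}(\pi_{\mathbb{S}_t}\, \lambda_{\mathcal{C}}\kappa_2(v))$.

iii.) Let $w \in V_2$ and $e \in E_2$ with $e^V \subseteq \sigma_2(w)$ and
$\kappa_2(e) \in \mathcal{R}_k$. We show the edge condition
$\lambda_{\mathcal{G}}\rho_2(e) \in \mathrm{Ext}(\pi_{\lambda_{\mathcal{S}}\rho_2(w)}\, \lambda_{\mathcal{R}}\kappa_2(e))$
for $\mathfrak{G}_2$ in $\mathbb{M}$. For this, we conclude from the edge condition for
$\mathfrak{G}_2$ in $\mathbb{M}^{\mathfrak{G}_1}$ and the definition of
$Y_k^{\mathfrak{G}_1}$ and $\iota^{\mathfrak{G}_1}$ that
$\iota_{\mathcal{G}}^{\mathfrak{G}_1}\rho_2(e) = \rho_2(e) \in \mathrm{Ext}\, \pi_{\mathbb{S}_{\rho_2(w)}}\, \iota_{\mathcal{R}}^{\mathfrak{G}_1}\kappa_2(e) = \{ (g_1, \dots, g_k) \mid ((g_1, \dots, g_k), \kappa_2(e), \rho_2(w)) \in Y_k^{\mathfrak{G}_1} \} = \{ (g_1, \dots, g_k) \in \mathcal{G}^k \mid \exists f \in E_1 : \rho_1(f) = (g_1, \dots, g_k),\ \kappa_1(f) \leq_{\mathcal{R}} \kappa_2(e),\ \rho_1(\sigma_1^{-1}(f^V)) = \rho_2(w) \} = \{ \rho_1(f) \mid f \in E_1,\ \kappa_1(f) \leq_{\mathcal{R}} \kappa_2(e),\ \rho_1(\sigma_1^{-1}(f^V)) = \rho_2(w) \}$.
Then, we proceed as for the vertex condition, and finally, analogously for
condition iv.). □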

Precision for Inference Rules in Definition 12. Let
$\mathfrak{G} := (V, E, \nu, \sigma, \kappa, \rho)$ and $\mathfrak{G}'$ be two nested
concept graphs over the alphabet
$\mathcal{A} := (\mathcal{G}, \mathcal{C}, \mathcal{R})$. We define the following rules:

1. If $\mathfrak{G}' := (V', E', \nu', \sigma', \kappa', \rho')$ satisfies $V' = (V \setminus W_v) \cup (W_v \times \{1, 2\})$ for a $v \in$

V and Wv := UmeNo ~ i^\ UmeNo U U (UmeN ^a-"^(v) 

x{l,2}) with Em '■= {e € E \ 3i = 1, . . . ,n : i^{e)\i € M} and E"" := {(e,<5) | 
eG{eGif|3i=l,...,A:: v{e)\_. = ?;}, (5 G jl,2}[®’'"l} (where [e, w] := 
{i G /c} I v{e)\i = w}); v'lf.E' — >• V' with i^'\i{e) := iy(e)\i for all 

e G E^m(^yph' |i(6)j) := (^(c)|i)j) for all (e,j) G UmeN ^ 

{1,2} and iy'\i(e,S) := i^{e)\i if z ^ [e,w] and v'\i(e,S) := (v,S(i)) if z G 
[e,z;] for all {e,5) G E""; a':V' — >■ ^{V') with a'{w) := a(w) for all w G 
dom CT\(iy„ U cr“^(v)), a'(w,j) := a{w) x {j} for all (w,j) G Wy x {1,2} 
and a'{a~^{v)) := ((j(ct“^(z;))\{v}) U {(f, 1), (z;,2)}; k':V U E' — >■ CUTZ 
with k'{x) := k{x) for all x G U if\ E„m(^y-^ and n'{x,l) := k{x) 

otherwise; p'\v\Wy = p\v\w„ and p'{w,j) = p{w) for all {w,j) G Wy x {1,2} 
then we say that $\mathfrak{G}'$ is derived from $\mathfrak{G}$ by doubling the vertex $v \in V$.

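2. If $\mathfrak{G}' = (V', E, \nu, \sigma|_{V'}, \kappa|_{V' \cup E}, \rho|_{V'})$ with
$V' = V \setminus \{v\}$ for a $v \in V \setminus \mathrm{dom}\,\sigma$ with
$\{ e \in E \mid \exists i = 1, \dots, k : \nu(e)|_i = v \} = \emptyset$, then
$\mathfrak{G}'$ is derived from $\mathfrak{G}$ by deleting the isolated, non-complex
vertex $v \in V$.

3. If $\mathfrak{G}' = (V, E', \nu', \sigma, \kappa', \rho)$ satisfies
$E' = E \setminus \{e\} \cup \{(e, 1), (e, 2)\}$ for an $e \in E$,
$\nu'|_{E \setminus \{e\}} = \nu|_{E \setminus \{e\}}$, $\nu'(e, j) = \nu(e)$ for
$j = 1, 2$, $\kappa'|_{V \cup (E \setminus \{e\})} = \kappa|_{V \cup (E \setminus \{e\})}$
and $\kappa'(e, j) = \kappa(e)$ for $j = 1, 2$, then we say that $\mathfrak{G}'$ is
derived from $\mathfrak{G}$ by doubling the edge $e \in E$.

4. If we have
$\mathfrak{G}' = (V, E \setminus \{e\}, \nu|_{E \setminus \{e\}}, \sigma, \kappa|_{V \cup E \setminus \{e\}}, \rho)$
for an $e \in E$, then we say that $\mathfrak{G}'$ is derived from $\mathfrak{G}$ by
deleting the edge $e \in E$.

5. If we have $\mathfrak{G}' = (V, E, \nu, \sigma, \kappa', \rho)$ and a $v \in V$ with
$\kappa'|_{V \setminus \{v\} \cup E} = \kappa|_{V \setminus \{v\} \cup E}$, and if there is
a $w \in V$ such that $\rho(w) = \rho(v)$,
$\rho(\sigma^{-1}(w)) = \rho(\sigma^{-1}(v))$ and $\kappa(w) \leq_{\mathcal{C}} \kappa'(v)$,
then we say that $\mathfrak{G}'$ is derived from $\mathfrak{G}$ by exchanging the
concept name $\kappa(v)$.

6. If $\mathfrak{G}' = (V, E, \nu, \sigma, \kappa', \rho)$ with
$\kappa'|_{V \cup E \setminus \{e\}} = \kappa|_{V \cup E \setminus \{e\}}$, and if there is
an $f \in E$ such that $\rho(f) = \rho(e)$,
$\rho(\sigma^{-1}(f^V)) = \rho(\sigma^{-1}(e^V))$ and
$\kappa(f) \leq_{\mathcal{R}} \kappa'(e)$, then we say that $\mathfrak{G}'$ is derived from
$\mathfrak{G}$ by exchanging the relation name $\kappa(e)$.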

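7. If two vertices $v, w \in V$ satisfy $\rho(v) = \rho(w)$ and
$\sigma^{-1}(v) = \sigma^{-1}(w)$, and if
$\mathfrak{G}' = (V', E', \nu', \sigma', \kappa', \rho')$ satisfies
$V' = V \setminus \{v, w\} \cup \{v \vee w\}$, $E' = E$; all $e \in E$ and all
$i \in \{1, \dots, k\}$ satisfy $\nu'(e)|_i = v \vee w$ if $\nu(e)|_i \in \{v, w\}$
and otherwise $\nu'(e)|_i = \nu(e)|_i$;
$\sigma'|_{\mathrm{dom}\,\sigma \setminus \{\sigma^{-1}(v), v, w\}} = \sigma|_{\mathrm{dom}\,\sigma \setminus \{\sigma^{-1}(v), v, w\}}$,
$\sigma'(\sigma^{-1}(v)) = (\sigma(\sigma^{-1}(v)) \setminus \{v, w\}) \cup \{v \vee w\}$
and $\sigma'(v \vee w) = \sigma(v) \cup \sigma(w)$;
$\kappa'|_{V \setminus \{v, w\} \cup E} = \kappa|_{V \setminus \{v, w\} \cup E}$ and
$\kappa'(v \vee w) = c$ for a $c \in \mathcal{C}$ with $\kappa(v) \leq_{\mathcal{C}} c$
and $\kappa(w) \leq_{\mathcal{C}} c$;
$\rho'|_{V \setminus \{v, w\}} = \rho|_{V \setminus \{v, w\}}$ and
$\rho'(v \vee w) = \rho(v)$, then we say that $\mathfrak{G}'$ is derived from
$\mathfrak{G}$ by joining the vertices $v$ and $w$ with equal references in the same
nesting.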

8. Let := {W, E,iy^,asj, k^,, p^) be a nested sub graph of ©. That means, 
it satisfies IT C y and F C E; the mappings zz, k and p are continua- 
tions of Vf,, Kf, and pf,; all w G domaf, satisfy cr^(zz;) = a{w) fl W and 
cr-^{W\[Jy^^asjiv)) = 0 or (Jf^(W\[jy^y^, af,{v)) = {z>} for a z; G y\iy. 
If we have ©' := (y U VP, if U F, zz U ct U k U kjj, pU pjj), and 0 = 
<T“^(iy\ cr^(z;)), then we say that ©' is derived from © by copying 
the sub graph Sj into the outer nesting of ©. 
If there is a f G V\W with a ^(IF\ Uj;eiT = {f} and if we have 0' = 

(yuVF, E(jF,iyOvsi,a',K(jKf,,p(jpf,) with cr'|y\{^} = cr|y\{„}, a'{v) = 
(J{v)\jW\[j^^y^, af,{v), and cr'lu„6w then we say that 0' is 
derived from 0 by copying the sub graph into the nesting of v. 

Idea for the Proof of Completeness (Proposition 2). For the proof we 
consider the elementary modules of nested concept graphs: The sub graphs that 
are generated by one edge or one vertex, that means the sub graph consisting 
of all vertices in which the generating edge or vertex is successively nested. 
Those elementary modules are important for the inferences on concept graphs: 
A nested concept graph does not entail all its sub graphs because if there is an 
outer nesting missing, the sub graph is not “in the right situation” anymore. 
Only the sub graphs being generated by one vertex or edge are entailed by $\mathfrak{G}$
(because they can be derived by deleting all other edges and vertices). 

In order to prove completeness, we derive, for each edge and each vertex of
$\mathfrak{G}_2$, the corresponding generated sub graph of $\mathfrak{G}_1$. That means,
e.g., for each edge $f \in E_2$ there is an edge $e_f \in E_1$ with
$\rho_1(e_f) = \rho_2(f)$, $\kappa_1(e_f) \leq_{\mathcal{R}} \kappa_2(f)$ and
$\rho_1(\sigma_1^{-1}(e_f^V)) = \rho_2(\sigma_2^{-1}(f^V))$ (it exists because of the
edge condition for $\mathfrak{G}_2$ in the standard model of $\mathfrak{G}_1$). This edge
$e_f$ generates a sub graph with a relation name $\kappa_1(e_f)$ that can be
generalized into $\kappa_2(f)$. Then, the sub graph's inner edge and the reference
of its inner nesting equal those of $f$ (but not the other nestings and labels
of vertices).

Having derived these modules for all edges and vertices, they can be joined
successively into a concept graph which is isomorphic to $\mathfrak{G}_2$. For this,
the modules are copied into the right nesting and then they are joined. Finally,
all superfluous modules are eliminated. (For a detailed formal proof see
[Pr98a].) □
Abstract. The aim of this paper is to mathematically introduce negation
to concept graphs (which are a mathematical modification of conceptual
graphs) as a well-defined syntactical construct. First, we discuss
some questions which arise when negations for conceptual graphs are
defined. In our view, a solution for these questions is to express
negations by cuts in the sense of Peirce’s theory of existential graphs. A
set-theoretical semantics for (nonexistential) concept graphs with cuts is
developed in the framework of contextual logic. A modification of Peirce’s
alpha-calculus, which is sound and complete, is presented.



1 Motivation 

Conceptual graphs are based on the existential graphs of Charles Sanders Peirce. 
These graphs consist of lines called lines of identities, predicate names of arbi- 
trary arity and ovals around subgraphs which are used to negate the enclosed 
subgraph. The following three examples are well known: 

[Figure: three existential graphs: CAT - ON - MAT; the same graph enclosed in an oval; and CAT - ON - MAT with an oval enclosing only ON]

The meanings of these graphs are: ‘a cat is on a mat’, ‘no cat is on any mat’
and ‘there is a cat and there is a mat such that the cat is not on the mat’.

As Peirce says, “That a proposition is false is a logical statement about it,
and therefore in a logical system deserves special treatment.” ([Pe98]). The
graphical element oval, which Peirce used to negate its enclosure, has been
transferred to context boxes in conceptual graphs. These boxes are used to
express that some information is valid in specific contexts or situations.
Hence, the character of negation as a logical operator in existential graphs
changed to a metalevel character in conceptual graphs. Of course, in knowledge
representation and natural language, negations are unavoidable. So the ability
to express negations is desirable in concept graphs.

To handle negation in concept graphs, we need to achieve the following aims: 
For the mathematical treatment, formation rules for the well-formed formulas 
must exist that can express negations, and negation has to be covered by rules 
of inference. To do this in the spirit of Peirce, the semantics of negation has 
to be intelligible, and the graphical representation of negations must be easily 
readable and intuitively understandable (which has been an important goal in 
the theory of conceptual graphs from the very beginning, too). 



B. Ganter and G.W. Mineau (Eds.): ICCS 2000, LNAI 1867, pp. 263-276, 2000.
© Springer-Verlag Berlin Heidelberg 2000
Negations occur in several approaches for conceptual graphs. Why we do not
adopt and mathematize one of these approaches shall be explained in the rest
of this section.

In order to handle negations, a specific syntactical element of the well-formed
formulas has to be declared to express them. For this purpose, the standard
approach uses a context box of type Proposition which is linked to a unary
relation of type neg ([So99]). Sometimes, these special context boxes are
abbreviated by context boxes of type Negation (e.g. [So00]) or by drawing a
simple rectangle with the mathematical negation symbol $\neg$ ([So99]). Some
approaches use these rectangles without declaring whether the box is a specific
syntactical element or just an abbreviation for context boxes of a specific
type (e.g. [We95]). But both a calculus and a translation of conceptual graphs
into other formal languages (like the translation to first order logic with the
$\Phi$-operator) have to respect the logical role of negation.

So, if negation is expressed just by special context boxes, any calculus and
any translation has to treat these special context boxes differently from all
other context boxes. For example, if negation is expressed by context boxes of
type Negation, a calculus should allow the nested boxes in the following graph
to be erased (and vice versa, to be introduced again):




This does not seem to be possible in any calculus which does not treat the
negation boxes separately (like the calculus of Prediger ([Pr98b]) or any
calculus which is based on projections).

If negation is expressed with contexts of type Proposition, linked to a unary 
relation neg, another difficulty appears. This shall be shown by the following two 
conceptual graphs: 




The first graph is well known: Its meaning is ‘a cat is on a mat’. In
particular, the graph claims to be true. The meaning of the second graph is,
strictly speaking, ‘there exists a proposition, which states that a cat is on a
mat’ ([So99]), and therefore different from the meaning of the first graph.
Indeed, in none of the common calculi can one graph be derived from the other
one. Hence it is problematic to express the negation of the first graph by the
second graph with a relation neg.

To summarize: It seems to be difficult to introduce negation as a special
context box. These context boxes have to be treated differently from other boxes
in the calculus and in any translation from conceptual graphs to other formal
languages (like the operator $\Phi$). This yields the following conclusion: For
the mathematical treatment of negation in concept graphs, in the definition of
their well-formed formulas there should be a specific syntactical element which
is used to express negation.

The next step has to clarify the semantics, i.e. the meaning, of negation. 
To prepare this, we will discuss a small example. Consider the true proposition 
‘the painter Rembrandt created the painting ‘the nightwatch”, which can be 
translated to the following conceptual graph: 



[PAINTER: Rembrandt] - (create) - [PAINTING: the nightwatch]



This graph represents not only the information that Rembrandt created ‘the 
nightwatch’, but also that Rembrandt is a painter and ‘the nightwatch’ is a paint- 
ing. Now, consider the painting ‘a starry night’ instead of ‘the nightwatch’. This 
painting was created by van Gogh, so the proposition ‘the painter Rembrandt 
did not create the painting ‘the starry night” is true. How can this proposition 
be transformed to a conceptual graph? The following graph is a first attempt: 




This graph is not the translation of the former proposition: In the proposition, 
only the verb ‘to create’ is negated, but in the graph, the negation box also 
encloses the information that Rembrandt is a painter and ‘the starry night’ is a 
painting. The information in the concept boxes can fail, too, as can be seen in 
the following graph: 




This graph is true although van Gogh did create ‘the starry night’. In par- 
ticular, this graph should not be read as 

The composer Van Gogh did not create the painting ‘a starry night’. 

But this understanding is suggested when Sowa says in [So00] that the meaning
of the graph [Negation: [Cat: Yoyo] -> (On) -> [Mat]] is ‘the graph denies that
the cat Yoyo is on a mat’. Now, the goal is to negate only the verb ‘to create’
in the false proposition ‘the painter Rembrandt created the painting ‘the starry
night’’. This problem has already been addressed; one approach for its solution
is the following graph:




Fig. 1. CG for ‘the painter Rembrandt did not create the painting ‘a starry night’’
Indeed, this graph expresses the proposition ‘the painter Rembrandt did not
create the painting ‘the starry night’’. But obviously, the aim of making
conceptual graphs easily readable and intuitively understandable is not
fulfilled.

Expressing identity in conceptual graphs with coreference-links or coreference- 
sets leads to another class of difficulties. In particular, the meaning of coreference- 
links connected to a concept box within a negation is not straightforward. This 
shall be explained next: 

In conceptual graphs, coreference-links (which are used to express
coreference-sets ([So99])) are used to express the identity of two entities:
“Two concepts that refer to the same individual are coreferent. [. . . ] To show
that they are coreferent, they are connected with a dotted line, called a
coreference link” ([So92]). Consider the following graph:




Fig. 2. Conceptual graph for $\exists x \exists y.\, x \neq y$

According to Sowa ([So00]), the operator $\Phi$ translates this graph into the
first order logic formula
$\exists x \exists y \neg \exists z (z = x \wedge z = y)$, which is equivalent to
$\exists x \exists y.\, x \neq y$. In particular, the three concept boxes cannot
refer to the same individual. Note that $\Phi$ assigns different variables to the
generic markers of different concept boxes, even if they are connected with a
coreference-link. These variables are explicitly set to be equal in the formula,
and they are equated inside the negated part of the formula. But since the links
in the graph look symmetric, it is not clear to a reader why the equating in the
formula is placed inside and not outside of the negated subformula. This
ambiguity can be seen even better in the following example:




If $\Phi$ translates this graph to $PAINTER(R) \wedge \neg(R = VG \wedge PAINTER(VG))$,
the resulting formula is true, but if $\Phi$ translates this graph to
$PAINTER(R) \wedge R = VG \wedge \neg PAINTER(VG)$, the resulting formula is not
true (the names in the formulas are abbreviated by $R$ and $VG$). Hence, in order
to understand the right meaning of this graph, the reader must have in mind the
implicit agreement that equality is always placed in the inner context.
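
The difference between the two readings can be checked mechanically. The following is a minimal sketch, assuming a toy model in which Rembrandt and van Gogh are two distinct individuals who are both painters. The names `R`, `VG`, `painter` and all function names are illustrative and not part of the original text.

```python
# Toy model (an assumption for illustration): two distinct individuals,
# both of whom are painters.
R, VG = "Rembrandt", "van Gogh"
painter = {R, VG}

def equality_inside_negation() -> bool:
    # PAINTER(R) and not (R = VG and PAINTER(VG))
    return (R in painter) and not (R == VG and VG in painter)

def equality_outside_negation() -> bool:
    # PAINTER(R) and R = VG and not PAINTER(VG)
    return (R in painter) and (R == VG) and not (VG in painter)

def at_least_two_things(domain) -> bool:
    # exists x, y such that there is no z with z = x and z = y
    return any(
        not any(z == x and z == y for z in domain)
        for x in domain for y in domain
    )

print(equality_inside_negation())   # True
print(equality_outside_negation())  # False
print(at_least_two_things({1}))     # False
print(at_least_two_things({1, 2}))  # True
```

The last two calls illustrate, on small finite domains only, why the formula for Figure 2 amounts to ‘there exist at least two things’.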

If we accept this meaning of coreference links, the next step is to make
clear which syntactical element in the well-formed formulas is used for them,
and how they are handled by a calculus. In the abstract syntax of conceptual
graphs ([So99]), coreference-links are generalized to coreference-sets. For
example, Figure 2 has two coreference sets which are represented by
coreference-links. Coreference-sets are sufficient to handle coreference in
conceptual graphs with negations. But rules are still needed that treat these
sets in a sound and complete way (for example, there have to be rules which
allow a link to be drawn or erased from a concept box to itself). Some calculi
lack rules like this.

To summarize: Introducing coreference sets to express identity may lead to
misunderstandings of their meaning and to gaps in their syntactical
implementation.

To cope with all the mentioned problems, we suggest the following: First, 
negations should be introduced as a new syntactical element, namely the ovals 
of Peirce, which can be drawn around arbitrary parts of a conceptual graph. To 
distinguish these ovals from the ovals which are drawn around relation names, 
we propose drawing them in bold. For example, in 



[Figure: concept boxes PAINTER: Rembrandt and PAINTING: the starry night, linked by the relation create, which is enclosed by a bold negation oval]



only the relation create has to be negated, and in 




Because the present interpretation of coreference-links is not always
intuitive, we suggest introducing a special binary relation id (as in first
order logic). The advantages of this approach are

1. id can be treated like other relations and

2. the identity can be negated without loss of readability. 

In our view this yields a more understandable notion of identity (as understood 
in mathematics). For example, the meaning of the graph 




Fig. 3. CG with negation ovals for $\exists x \exists y.\, x \neq y$

is ‘there exist at least two things’. Thus, it has the same meaning as the graph 
in Figure 2, but is much simpler. Furthermore, it shows that id is a proper 
syntactical extension and not a direct mathematization of coreference links. 

Since the syntactical elements which allow negations and identity are
extended in this approach, every conceptual graph with negation boxes and
coreference-links can be translated into a concept graph with negation ovals
and the relation id. Of course, negation boxes (for example, context boxes of
type Negation) are translated to negation ovals. A coreference-link between two
context boxes is translated into a relation id between the boxes such that the
relation node id is placed in the negation oval of the dominated context box. We
will exemplify this with the following: The conceptual graph

is translated to the following conceptual graph with negation ovals and the
relation id:




Fig. 4. Another CG with negation ovals for $\exists x \exists y.\, x \neq y$

This graph can be transformed to the graph in Figure 3, which has the same
meaning, but looks much simpler. On the other hand, concept graphs with
negation ovals and the relation id can be translated to graphs with negation
boxes and coreference-links. But because negation ovals need not enclose
subgraphs, but may enclose arbitrary subsets of concept nodes and relation
nodes, this translation is more complicated than the translation in the other
direction. For example, before the graph in Figure 3 can be translated, it has
to be transformed into the graph in Figure 4.

The approach we present here is closely related to the original ideas of Peirce. 
It is easier to mathematize than approaches based on concept boxes of a specific 
type. In this paper, our approach shall be elaborated for simple concept graphs 
without generic markers, but with negation ovals and the relation id. In partic- 
ular, the syntax for these graphs is defined, an extensional semantics for these 
graphs is introduced (which is based on power context families), and a sound 
and complete calculus is presented. Furthermore, this approach allows us to
define mathematically the operator $\Phi$ on simple concept graphs (which maps
graphs to first order logic formulas) and its inverse operator $\Psi$ (which maps
first order logic formulas to graphs) such that both respect the (syntactical or
semantic) entailment relation on graphs and formulas, respectively. In
particular, the expressiveness of simple graphs and first order logic formulas
is the same. This will be elaborated in a work which is in progress now.



2 Basic Definitions 

Simple concept graphs are introduced by Prediger in [Pr98b] as mathematically
defined syntactical constructs. We adopt her approach and extend it to include
the possibility to express negations by using cuts and the possibility to
express identity by using a special binary relation id.

First we have to start with ordered sets of names for objects, concepts and
relations. These orders represent the conceptual ontology of the domain we
consider.

Definition 1. An alphabet of conceptual graphs is a triple
$\mathcal{A} := (\mathcal{G}, \mathcal{C}, \mathcal{R})$ such that

— $\mathcal{G}$ is a finite set whose elements are called object names,

— $(\mathcal{C}, \leq_{\mathcal{C}})$ is a finite ordered set with a greatest
element $\top$ whose elements are called concept names,

— $(\mathcal{R}, \leq_{\mathcal{R}})$ is a union of finite ordered sets
$(\mathcal{R}_k, \leq_{\mathcal{R}_k})$, $k = 1, \ldots, n$ (for an
$n \in \mathbb{N}$ with $n \geq 1$) whose elements are called relation names.
Let $id \in \mathcal{R}_2$.

Now we can define the underlying structures of concept graphs with cuts. 
This definition extends the definition of directed multi-hypergraphs given in 
[Pr98b] by cuts, so that negations can be expressed.

Definition 2. A directed multi-hypergraph with cuts (of type $n$) is a structure
$(V, E, \nu, Cut, area)$ such that

— $V$ and $E$ are finite sets whose elements are called vertices and edges,
respectively,

— $\nu : E \to \bigcup_{k=1}^{n} V^k$ (for an $n \in \mathbb{N}$, $n \geq 1$) is a mapping,

— $Cut$ is a finite set whose elements are called cuts and

— $area : Cut \to \mathfrak{P}(V \cup E \cup Cut)$ is a mapping such that
$c \notin area(c)$ for each $c \in Cut$ and, for two cuts $c_1, c_2$ with
$c_1 \neq c_2$, exactly one of the following conditions holds:

i) $\{c_1\} \cup area(c_1) \subseteq area(c_2)$,
ii) $\{c_2\} \cup area(c_2) \subseteq area(c_1)$,
iii) $(\{c_1\} \cup area(c_1)) \cap (\{c_2\} \cup area(c_2)) = \emptyset$.

For an edge $e \in E$ with $\nu(e) = (v_1, \ldots, v_k)$ we define $|e| := k$ and
$\nu(e)|_i := v_i$. For each $v \in V$, let
$E_v := \{e \in E \mid \exists i: \nu(e)|_i = v\}$, and analogously for each
$e \in E$, let $V_e := \{v \in V \mid \exists i: \nu(e)|_i = v\}$. If it cannot be
misunderstood, we write $e|_i$ instead of $\nu(e)|_i$.
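
The two requirements on the mapping area can be tested directly on small examples. The following is a minimal sketch, assuming that cuts, vertices and edges are identified by strings and that area is stored as a dictionary of sets. The function name and the example data are hypothetical.

```python
from itertools import combinations

def is_valid_area(cuts, area) -> bool:
    """Check the requirements on `area` stated in Definition 2.

    cuts: set of cut identifiers
    area: dict mapping each cut to the set of vertices, edges and cuts it encloses
    """
    # No cut may lie in its own area.
    if any(c in area[c] for c in cuts):
        return False
    for c1, c2 in combinations(cuts, 2):
        closed1 = {c1} | area[c1]
        closed2 = {c2} | area[c2]
        conditions = [
            closed1 <= area[c2],      # i)   c1 and its area lie inside c2
            closed2 <= area[c1],      # ii)  c2 and its area lie inside c1
            not (closed1 & closed2),  # iii) the two cuts have nothing in common
        ]
        # Exactly one of the three conditions has to hold.
        if sum(conditions) != 1:
            return False
    return True

# Hypothetical example: cut "c2" contains vertex "v1" and cut "c1",
# which in turn contains vertex "v2".
nested = {"c1": {"v2"}, "c2": {"v1", "c1", "v2"}}
# Vertex "v" would lie in both cuts although neither cut encloses the other.
crossing = {"c1": {"v"}, "c2": {"v"}}

print(is_valid_area(set(nested), nested))      # True
print(is_valid_area(set(crossing), crossing))  # False
```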

The notion of cuts and areas is closely related to the ideas of Peirce, as they
are described in the work of Roberts (see [Ro73]). Peirce negated parts of an
existential graph just by drawing an oval around it. This oval (more exactly just
the line which is drawn on the sheet of assertion) is called a cut. In particular,
a cut is not a graph. The space within a cut is called its close or area. So the
area of a cut c contains vertices, edges and other cuts, even if they are nested
more deeply inside other cuts, but not the cut c itself. All the edges, vertices
and cuts in the area of c are said to be enclosed by c.

Cuts do not intersect each other by the definition of Peirce. So for two
different cuts $c_1, c_2$, exactly one of the following cases occurs:

— $c_1$ and its area are entirely enclosed by $c_2$,

— $c_2$ and its area are entirely enclosed by $c_1$,

— $c_1$ and its area and $c_2$ and its area have nothing in common.

Obviously, these three cases coincide with the three conditions for the
mapping area in Definition 2. Now, let us first mention some simple properties
of the mapping area which can be shown easily:

— $c_1 \neq c_2 \wedge area(c_1) = area(c_2) \Rightarrow area(c_1) = area(c_2) = \emptyset$

— $\emptyset \subsetneq area(c_1) \subsetneq area(c_2) \Rightarrow c_1 \in area(c_2)$

— $c_1 \in area(c_2) \Rightarrow area(c_1) \subsetneq area(c_2)$

In many cases it makes sense to treat the outermost context, the sheet of
assertion, as an (additional) cut. If we abbreviate the sheet of assertion by
$\top$, we immediately come to the following definition:

Definition 3. If $Cut$ is a set of cuts of a directed multi-hypergraph with cuts
and if $area$ is the appropriate mapping, then let $Cut^\top := Cut \cup \{\top\}$
and $area(\top) := V \cup E \cup Cut$.

It is easy to see that this extension still satisfies the conditions for the
mapping area which are given in Definition 2. This means that the properties we
have just shown for $Cut$ hold for $Cut^\top$, too.

By $c_1 < c_2 :\Leftrightarrow c_1 \in area(c_2)$, a canonical ordering on
$Cut^\top$, which is a tree with $\top$ as greatest element, is defined. This can
be verified with the properties for the mapping area.

Obviously, each edge and vertex is enclosed directly (and not deeper nested) 
in a uniquely given cut c. For the further work, the notion of a subgraph is needed. 
It seems to be evident that a subgraph is enclosed directly in a uniquely given cut 
c, too. The notions of being directly enclosed and subgraph shall become precise 
through the following definition: 

Definition 4. Let $\mathfrak{G} = (V, E, \nu, Cut, area)$ be a directed
multi-hypergraph with cuts.

— For each $k \in V \cup E \cup Cut$ we define

$cut(k) := \min\{c \in Cut^\top \mid k \in area(c)\}$

$cut(k)$ is called the cut of $k$ and $cut(k)$ is said to enclose directly the
vertex (the edge, the cut) $k$.

— The graph $\mathfrak{G}' = (V', E', \nu', Cut', area')$ is called a subgraph of
$\mathfrak{G}$ in the cut $c$ if $c \in Cut^\top$ is the smallest cut such that the
following conditions hold:

• $V' \subseteq V$, $E' \subseteq E$, $Cut' \subseteq Cut$ and the mappings $\nu'$
and $area'$ are just the restrictions of $\nu$ and $area$ to $E'$ resp. $Cut'$
(and are therefore well defined),

• $area(c') \subseteq V' \cup E' \cup Cut'$ for each $c' \in Cut'$,

• $cut(k') \in Cut' \cup \{c\}$ for each $k' \in V' \cup E' \cup Cut'$,

• $v \in V'$ for each edge $e' \in E'$ and every vertex $v \in V_{e'}$.

We write: $\mathfrak{G}' \subseteq \mathfrak{G}$ and $cut(\mathfrak{G}') = c$.
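
The directly enclosing cut of an element can be computed from the area mapping alone. The following is a minimal sketch, assuming the same dictionary encoding of area as a map from cut identifiers to sets. The constant `TOP`, the function name and the example data are illustrative choices, not part of the original text.

```python
TOP = "T"  # stands for the sheet of assertion (an illustrative name)

def directly_enclosing_cut(k, area, universe):
    """Return cut(k): the smallest cut of Cut plus TOP whose area contains k."""
    extended = dict(area)
    extended[TOP] = set(universe)  # area(TOP) := V, E and Cut taken together
    candidates = [c for c in extended if k in extended[c]]
    # All cuts containing k are nested in one another, and a cut nested in
    # another one has a strictly smaller area, so the candidate with the
    # smallest area is the minimum of the chain.
    return min(candidates, key=lambda c: len(extended[c]))

# Hypothetical example: "c2" encloses "v1" and the cut "c1"; "c1" encloses "v2".
area = {"c1": {"v2"}, "c2": {"v1", "c1", "v2"}}
universe = {"v0", "v1", "v2", "c1", "c2"}

print(directly_enclosing_cut("v2", area, universe))  # c1
print(directly_enclosing_cut("v1", area, universe))  # c2
print(directly_enclosing_cut("v0", area, universe))  # T
```

The sketch relies on the area mapping satisfying the conditions for cuts; it does not verify them.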

Note that for each vertex (or edge, cut or subgraph), the set of all cuts
containing the vertex forms a chain. If the number of cuts enclosing the vertex
is even, the vertex is said to be evenly enclosed, and analogously, if the number
is odd, the vertex is said to be oddly enclosed. More formally:

Definition 5. Let $\mathfrak{G} = (V, E, \nu, Cut, area)$ be a directed
multi-hypergraph with cuts, let $k$ be a subgraph or an element of
$V \cup E \cup Cut^\top$. Let $n$ be the number of cuts which enclose $k$
($n := |\{c \in Cut \mid k \in area(c)\}|$). If $n$ is even, $k$ is said to be
evenly enclosed, otherwise $k$ is said to be oddly enclosed. An evenly enclosed
cut is called positive, an oddly enclosed cut is called negative.

Now, the structure of simple concept graphs with cuts is derived from the
structure of directed multi-hypergraphs with cuts by additionally labeling the
vertices and edges with concept names and relation names, respectively, and
by assigning a reference to each vertex. In particular all definitions concerning
directed multi-hypergraphs with cuts can be transferred to concept graphs. So
in the following we will deal with subgraphs of concept graphs etc.

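Definition 6. A (nonexistential) simple concept graph with cuts over the
alphabet $\mathcal{A}$ is a structure
$\mathfrak{G} := (V, E, \nu, Cut, area, \kappa, \rho)$, where

— $(V, E, \nu, Cut, area)$ is a directed multi-hypergraph with cuts,

— $\kappa : V \cup E \to \mathcal{C} \cup \mathcal{R}$ is a mapping such that
$\kappa(V) \subseteq \mathcal{C}$ and $\kappa(E) \subseteq \mathcal{R}$, and all
$e \in E$ with $\nu(e) = (v_1, \ldots, v_k)$ satisfy $\kappa(e) \in \mathcal{R}_k$,

— $\rho : V \to \mathcal{G}$ is a mapping.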

It is not clear what a graph containing vertices with more than one object,
enclosed by a cut, means, and this might lead to misunderstandings. For this
reason, the mapping $\rho$ maps vertices only to elements of $\mathcal{G}$, not
to subsets of $\mathcal{G}$ (in contrast to the definition of Prediger in
[Pr98b]). Furthermore, $\rho$ can be naturally extended to the edges: If $e$ is
an edge with $\nu(e) = (v_1, \ldots, v_k)$, let
$\rho(e) := (\rho(v_1), \ldots, \rho(v_k))$.
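
As a concrete illustration of the labeling and reference mappings, the following minimal sketch stores them as Python dictionaries, checks the arity condition on edge labels, and extends the reference mapping to edges. The dictionary encoding, the function names and the example names (`PAINTER`, `create`, and so on) are assumptions made for illustration only.

```python
def labels_are_valid(V, E, nu, kappa, concept_names, relations_by_arity) -> bool:
    """Check that vertices carry concept names and that every edge with
    k incident vertices carries a relation name of arity k."""
    vertices_ok = all(kappa[v] in concept_names for v in V)
    edges_ok = all(
        kappa[e] in relations_by_arity.get(len(nu[e]), set()) for e in E
    )
    return vertices_ok and edges_ok

def rho_of_edge(e, nu, rho):
    """Extend rho to edges: the tuple of references of the incident vertices."""
    return tuple(rho[v] for v in nu[e])

# Hypothetical alphabet and graph:
# PAINTER: vanGogh -> (create) -> PAINTING: starryNight
concept_names = {"TOP", "PAINTER", "PAINTING"}
relations_by_arity = {2: {"id", "create"}}
V, E = {"v1", "v2"}, {"e1"}
nu = {"e1": ("v1", "v2")}
kappa = {"v1": "PAINTER", "v2": "PAINTING", "e1": "create"}
rho = {"v1": "vanGogh", "v2": "starryNight"}

print(labels_are_valid(V, E, nu, kappa, concept_names, relations_by_arity))  # True
print(rho_of_edge("e1", nu, rho))  # ('vanGogh', 'starryNight')
```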

3 Semantics 

Usually, a semantics for conceptual graphs is given by a translation of graphs into 
formulas of first order logic, hence into formulas of another syntactically given 
structure. In Prediger (cf. [Pr98a], [Pr98b]), a different approach is presented. 
There, an extensional semantics which is based on power context families as 
model structures is introduced. The motivation for this contextual semantics 
can be read in [Pr98a]. With this semantics, Prediger develops a semantical 
entailment relation between concept graphs, and a sound and complete calculus 
for this entailment relation is presented. Now this approach shall be extended to 
concept graphs with cuts. 

In concept graphs without cuts, only the conjunction of positive information
can be expressed. For this reason it was possible for Prediger to construct for
each concept graph a standard model in which all the information of the concept
graph is encoded. Standard models have been an additional possibility (besides
the entailment relation and the calculus) for reasoning with concept graphs.
If negations are used, one can express the disjunction of pieces of information
with concept graphs. But a disjunction of information cannot be canonically
encoded in standard models. Thus if we introduce negations to concept graphs,
unfortunately the construction of standard models has to be dropped.

Now, let us recall the basic definitions of Prediger. 

Definition 7. A power context family
$\mathbb{K} := (\mathbb{K}_0, \ldots, \mathbb{K}_n)$ of type $n$ (for an
$n \in \mathbb{N}$) is a family of contexts $\mathbb{K}_k := (G_k, M_k, I_k)$ that
satisfies $G_k \subseteq (G_0)^k$ for each $k = 1, \ldots, n$. Then we write
$\mathbb{K} := (G_k, M_k, I_k)_{k=0,\ldots,n}$. The elements of the set
$\mathfrak{R}_{\mathbb{K}} := \bigcup_{k=1}^{n} \mathfrak{B}(\mathbb{K}_k)$ are
called relation-concepts.

Interpreting a concept graph in a power context family, the object names will
be interpreted by objects, i.e. by elements of the set $G_0$. The concept names of
our alphabet will be interpreted by concepts in the context $\mathbb{K}_0$, and
relation names of arity $k$ will be interpreted by relation-concepts in the
context $\mathbb{K}_k$. Of course, every reasonable interpretation has to respect
the orders on the names. This leads to the following definition:

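Definition 8. For an alphabet
$\mathcal{A} := (\mathcal{G}, \mathcal{C}, \mathcal{R})$ and a power context
family $\mathbb{K}$, we call the union
$\lambda := \lambda_{\mathcal{G}} \cup \lambda_{\mathcal{C}} \cup \lambda_{\mathcal{R}}$
of the mappings $\lambda_{\mathcal{G}} : \mathcal{G} \to G_0$,
$\lambda_{\mathcal{C}} : \mathcal{C} \to \mathfrak{B}(\mathbb{K}_0)$ and
$\lambda_{\mathcal{R}} : \mathcal{R} \to \mathfrak{R}_{\mathbb{K}}$ a
$\mathbb{K}$-interpretation of $\mathcal{A}$, if $\lambda_{\mathcal{C}}$ and
$\lambda_{\mathcal{R}}$ are order-preserving, $\lambda_{\mathcal{C}}(\top) = \top$,
$\lambda_{\mathcal{R}}(\mathcal{R}_k) \subseteq \mathfrak{B}(\mathbb{K}_k)$ for
all $k = 1, \ldots, n$, and
$(g_1, g_2) \in Ext(\lambda_{\mathcal{R}}(id)) \Leftrightarrow g_1 = g_2$ for all
$g_1, g_2 \in G_0$ hold. The tuple $(\mathbb{K}, \lambda)$ is called
context-interpretation of $\mathcal{A}$ or, according to classical logic,
$\mathcal{A}$-structure.

Recall that we defined $\rho(e) := (\rho(v_1), \ldots, \rho(v_k))$ for edges $e$
with $\nu(e) = (v_1, \ldots, v_k)$. Because $\lambda_{\mathcal{G}}$ is a mapping
on the set $\mathcal{G}$ of object names, it can be naturally extended to tuples
of object names. In particular we get
$\lambda_{\mathcal{G}}(\rho(e)) := (\lambda_{\mathcal{G}}(\rho(v_1)), \ldots, \lambda_{\mathcal{G}}(\rho(v_k)))$.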

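Now we can define whether a concept graph is valid in an $\mathcal{A}$-structure.
This is done in a canonical way:

Definition 9. Let $\mathbb{K}$ be a power context family and let $\mathfrak{G}$ be
a concept graph. Inductively over $c \in Cut^\top$, we define
$\mathbb{K} \models \mathfrak{G}[c]$ in a canonical way:

$\mathbb{K} \models \mathfrak{G}[c] :\Longleftrightarrow$

— $\lambda_{\mathcal{G}}(\rho(v)) \in Ext(\lambda_{\mathcal{C}}(\kappa(v)))$ for each $v \in V$ with $cut(v) = c$ (vertex condition)

— $\lambda_{\mathcal{G}}(\rho(e)) \in Ext(\lambda_{\mathcal{R}}(\kappa(e)))$ for each $e \in E$ with $cut(e) = c$ (edge condition)

— $\mathbb{K} \not\models \mathfrak{G}[c']$ for each $c' \in Cut$ with $cut(c') = c$ (iteration over $Cut^\top$)

For $\mathbb{K} \models \mathfrak{G}[\top]$ we write $\mathbb{K} \models \mathfrak{G}$.

If we have two concept graphs $\mathfrak{G}_1, \mathfrak{G}_2$ such that
$\mathbb{K} \models \mathfrak{G}_2$ for each $\mathcal{A}$-structure with
$\mathbb{K} \models \mathfrak{G}_1$, we write $\mathfrak{G}_1 \models \mathfrak{G}_2$.

Intuitively, $\mathbb{K} \models \mathfrak{G}[c]$ can be read as
$\mathbb{K} \models \mathfrak{G}|_{area(c)}$, but note that generally $area(c)$ is
not a subgraph of $\mathfrak{G}$. Therefore this should not be understood as a
precise definition.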
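
The inductive definition of validity translates naturally into a recursive check. The following is a minimal sketch under two stated assumptions: the power context family and its interpretation are collapsed into plain extents (sets of objects and sets of tuples), and the condition on directly enclosed cuts is read as a negation, as the role of cuts suggests. The encoding, the constant `TOP` and all names are hypothetical.

```python
TOP = "T"  # the sheet of assertion (illustrative name)

def holds(graph, model, c=TOP) -> bool:
    """Evaluate a graph in the context c: vertex condition, edge condition,
    and negation of every cut that c encloses directly."""
    obj, concept_ext, relation_ext = model
    cut_of = graph["cut_of"]  # maps each vertex, edge and cut to its directly enclosing cut
    # vertex condition
    for v, (concept, ref) in graph["vertices"].items():
        if cut_of[v] == c and obj[ref] not in concept_ext[concept]:
            return False
    # edge condition
    for e, (relation, incident) in graph["edges"].items():
        if cut_of[e] == c:
            args = tuple(obj[graph["vertices"][v][1]] for v in incident)
            if args not in relation_ext[relation]:
                return False
    # a directly enclosed cut must not hold
    for d in graph["cuts"]:
        if cut_of[d] == c and holds(graph, model, d):
            return False
    return True

# Model (an assumption): van Gogh, not Rembrandt, created 'the starry night'.
model = (
    {"Rembrandt": "r", "vanGogh": "g", "starryNight": "s"},  # object names to objects
    {"PAINTER": {"r", "g"}, "PAINTING": {"s"}},              # concept extents
    {"create": {("g", "s")}},                                # relation extents
)

def did_not_create(painter_name):
    """Graph with two vertices on the sheet of assertion and only `create` inside a cut."""
    return {
        "vertices": {"v1": ("PAINTER", painter_name), "v2": ("PAINTING", "starryNight")},
        "edges": {"e1": ("create", ("v1", "v2"))},
        "cuts": {"c1"},
        "cut_of": {"v1": TOP, "v2": TOP, "e1": "c1", "c1": TOP},
    }

print(holds(did_not_create("Rembrandt"), model))  # True
print(holds(did_not_create("vanGogh"), model))    # False
```

Because only the edge lies inside the cut, the graph for Rembrandt holds while the information in both concept boxes stays asserted, which is the reading the negation ovals are meant to support.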



4 Calculus 

The following calculus is based on the $\alpha$-calculus of Peirce for existential
graphs without lines of identity. These existential graphs consist only of
propositional variables and ovals and are equivalent to propositional calculus.

For the sake of intelligibility, the whole calculus is described using common
spoken language. Only the rules ‘erasure’, ‘iteration’, and ‘merging two vertices’
will be described in a mathematically precise manner to show that using full
sentences does not imply the loss of precision. This precision is definitely
necessary because there must not be any possibility for misunderstandings of the
rules. The rule ‘iteration’, for example, says that a subgraph of a graph can be
copied into the same or a nested context. If this is to have a unique meaning,
one requires a precise definition of ‘subgraph’ and ‘same or nested context’.

First, we present the whole calculus. The first five rules of the calculus are
the original rules of Peirce’s $\alpha$-calculus. The further rules are needed to
encompass the orders on the concept and relation names, to encompass the special
properties of the concept name $\top$ and the relation name id and to deal with
the possibility that different vertices can have the same reference.

Definition 10. The calculus for (nonexistential) simple concept graphs with cuts
over the alphabet $\mathcal{A}$.

— erasure 

In positive cuts, any directly enclosed edge, isolated vertex and closed sub- 
graph may be erased. 

— insertion 

In negative cuts, any directly enclosed edge, isolated vertex and closed sub- 
graph may be inserted. 

— iteration 

Let $\mathfrak{G}_0 := (V_0, E_0, \nu_0, \kappa_0, \rho_0, Cut_0)$ be a subgraph of
$\mathfrak{G}$ and let $c \leq cut(\mathfrak{G}_0)$ be a cut such that
$c \notin Cut_0$. Then a copy of $\mathfrak{G}_0$ may be inserted into $c$.

— deiteration

If $\mathfrak{G}_0$ is a subgraph of $\mathfrak{G}$ which could have been inserted
by the rule of iteration, then it may be erased.

— double negation

Double cuts (two cuts $c_1, c_2$ with $cut^{-1}(c_2) = \{c_1\}$) may be inserted or erased.

— isomorphism 

A graph may be substituted by an isomorphic copy of itself. 

— generalization 

For evenly enclosed vertices and edges the concept names respectively relation 
names may be generalized. 

— specialization 

For oddly enclosed vertices and edges the concept names respectively relation 
names may be specialized. 

— $\top$-rule

For each object name $g$, an isolated vertex $\top : g$ may be inserted or erased
in arbitrary cuts.

— merging two vertices

Two vertices in the same cut and with the same reference may be merged.

— reverse merging of two vertices

A merging of two vertices may be reversed.

For each object name $g$, a vertex $\top : g$ may be merged into a vertex $P : g$
(i.e. $\top : g$ is erased and, for every edge $e$, $e|_i = \top : g$ is
substituted by $e|_i = P : g$).

— rules of identity

• reflexivity

For arbitrary vertices $v$, edges $e$ with $\kappa(e) = id$, $cut(e) = cut(v)$ and
$e|_1 = e|_2 = v$ may be inserted or erased.

• symmetry

If $e$ is an edge with $\kappa(e) = id$, then $e$ may be substituted by an edge
$e'$ which fulfills $e'|_1 = e|_2$, $e'|_2 = e|_1$ and $cut(e') = cut(e)$.

• transitivity

If $e_1, e_2$ are two edges with $\kappa(e_1) = \kappa(e_2) = id$,
$cut(e_1) = cut(e_2)$ and $e_1|_2 = e_2|_1$, then edges $e$ with $\kappa(e) = id$,
$cut(e) = cut(e_1)$, $e|_1 = e_1|_1$ and $e|_2 = e_2|_2$ may be inserted or erased.

• congruence

If $e$ is an edge with $\rho(e|_1) = g_1$, $\rho(e|_2) = g_2$ and $\kappa(e) = id$,
then $\rho(e|_1) = g_1$ may be substituted by $\rho(e|_1) = g_2$.

To see how these rules can be written down mathematically, here are the 
precise definitions for the rules ‘erasure’, ‘iteration’ and ‘merging two vertices’. 

— If $\mathfrak{G} := (V, E, \nu, \kappa, \rho, Cut)$ is a concept graph with the
closed subgraph $\mathfrak{G}_0 := (V_0, E_0, \nu_0, \kappa_0, \rho_0, Cut_0)$ and
if $c$ is a cut with $c \notin Cut_0$, then let $\mathfrak{G}'$ be the following
graph:

• $V' := V \times \{1\} \cup V_0 \times \{2\}$

• $E' := E \times \{1\} \cup E_0 \times \{2\}$

• $\nu'((e,i)) = ((v_1,i), \ldots, (v_n,i))$ for $(e,i) \in E'$ and $\nu(e) = (v_1, \ldots, v_n)$

• $\kappa'((e,i)) := \kappa(e)$ and $\kappa'((v,i)) := \kappa(v)$ for all $(e,i) \in E'$, $(v,i) \in V'$

• $\rho'((v,i)) = \rho(v)$ for all $(v,i) \in V'$

• $Cut' := Cut \times \{1\} \cup Cut_0 \times \{2\}$

• $area'$ is defined as follows: Let $d \in Cut$.

for $d \in Cut_0$ let $area'((d,2)) := area(d) \times \{2\}$
for $d \not\geq c$ let $area'((d,1)) := area(d) \times \{1\}$
for $d \geq c$ let $area'((d,1)) := area(d) \times \{1\} \cup (V_0 \cup E_0 \cup Cut_0) \times \{2\}$

Then we say that $\mathfrak{G}'$ is derived from $\mathfrak{G}$ by iterating the
subgraph $\mathfrak{G}_0$ into the cut $c$.

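— If $\mathfrak{G} := (V, E, \nu, \kappa, \rho, Cut)$ is a concept graph with the
closed subgraph $\mathfrak{G}_0 := (V_0, E_0, \nu_0, \kappa_0, \rho_0, Cut_0)$,
then let $\mathfrak{G}'$ be the following graph:

• $V' := V \setminus V_0$

• $E' := E \setminus E_0$

• $\nu' := \nu|_{E'}$

• $\kappa' := \kappa|_{V' \cup E'}$

• $\rho' := \rho|_{V'}$

• $Cut' := Cut \setminus Cut_0$

• $area'(c') := area(c')|_{V' \cup E' \cup Cut'}$

Then we say that $\mathfrak{G}'$ is derived from $\mathfrak{G}$ by erasing the
subgraph $\mathfrak{G}_0$.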

— If $\mathfrak{G} := (V, E, \nu, \kappa, \rho, Cut)$ is a concept graph with two
vertices $v_1, v_2 \in V$, then let $\mathfrak{G}'$ be the following graph:

• $V' := V \setminus \{v_1\}$

• $E' := E$

• $\nu'$ is defined as follows: For $\nu(e)|_i = v$ let $\nu'(e)|_i := v$ if
$v \neq v_1$, and $\nu'(e)|_i := v_2$ if $v = v_1$

• $\kappa' := \kappa|_{V' \cup E'}$

• $\rho' := \rho|_{V'}$

• $Cut' = Cut$

• $area'(c') := area(c')$

Then we say that $\mathfrak{G}'$ is derived from $\mathfrak{G}$ by merging $v_1$
into $v_2$.
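
The set-theoretic definition of erasure can be mirrored almost literally in code. The following is a minimal sketch, assuming a graph is stored as a dictionary of sets and mappings; it performs only the construction and does not check that the subgraph is closed or that it lies in a positive cut. The encoding, the function name and the example data are hypothetical.

```python
def erase(graph, sub):
    """Erase the subgraph `sub` (given by its sets V, E and Cut) from `graph`
    by restricting every component to the remaining elements."""
    removed = sub["V"] | sub["E"] | sub["Cut"]
    V = graph["V"] - sub["V"]
    E = graph["E"] - sub["E"]
    Cut = graph["Cut"] - sub["Cut"]
    return {
        "V": V,
        "E": E,
        "Cut": Cut,
        "nu": {e: graph["nu"][e] for e in E},
        "kappa": {x: graph["kappa"][x] for x in V | E},
        "rho": {v: graph["rho"][v] for v in V},
        "area": {c: graph["area"][c] - removed for c in Cut},
    }

# Hypothetical graph: an isolated vertex "v3" next to an edge "e1" inside cut "c1".
graph = {
    "V": {"v1", "v2", "v3"},
    "E": {"e1"},
    "Cut": {"c1"},
    "nu": {"e1": ("v1", "v2")},
    "kappa": {"v1": "PAINTER", "v2": "PAINTING", "v3": "PAINTER", "e1": "create"},
    "rho": {"v1": "Rembrandt", "v2": "starryNight", "v3": "vanGogh"},
    "area": {"c1": {"e1"}},
}
isolated_vertex = {"V": {"v3"}, "E": set(), "Cut": set()}

result = erase(graph, isolated_vertex)
print(sorted(result["V"]))  # ['v1', 'v2']
```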

These rules are sound and complete with respect to the given semantics (see 
Theorem 1). Instead of proving this theorem formally, some heuristics for the 
rules are presented. 

First note that all the rules are in some sense dually symmetric with respect
to positive and negative cuts. More precisely, every rule which can be applied in
one direction in positive cuts can be applied in the opposite direction in negative
cuts, and vice versa. So if a rule can only be applied in positive contexts, this
rule has a counterpart for negative contexts (like erasure and insertion or like
generalization and specialization). All other rules apply both to positive and
negative contexts.

The first five rules are sound and complete concerning the classical
propositional calculus. If all vertices and edges were understood as logically
independent propositional variables, these rules would be enough. The rules
‘generalization’, ‘specialization’ and ‘$\top$-rule’ encompass the orders on the
concept and relation names. Note that $\top$ is not only the greatest element of
all concepts: The semantics for $\top$ implies that every object belongs to the
extension of the concept $\top$. Thus the generalization rule does not encompass
all properties of the concept $\top$, and the $\top$-rule is necessary. The same
is true for the relation id. In fact it is a congruence relation by definition.
This is encompassed by the id-rules. The specialization rule can be derived from
the other rules, but it is added to keep the calculus symmetric. The rules
‘merging two vertices’ and ‘reverse merging of two vertices’ deal with the fact
that one object may be the reference for different vertices. With these rules it
is possible to transform every concept graph into an equivalent graph in which
no cut intersects a relation line. More precisely:

Definition 11. A concept graph is called free of intersections, if it fulfills the
following condition: $\forall e \in E\ \forall v \in V: v \in V_e \Rightarrow cut(v) = cut(e)$
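
This condition is easy to test once the directly enclosing cut of every vertex and edge is known. The following is a minimal sketch, assuming the incidence mapping and the enclosing cuts are given as dictionaries. The function name and the example data are hypothetical.

```python
def is_free_of_intersections(nu, cut_of) -> bool:
    """True if every vertex incident with an edge is directly enclosed
    by the same cut as that edge."""
    return all(
        cut_of[v] == cut_of[e]
        for e, incident in nu.items()
        for v in incident
    )

nu = {"e1": ("v1", "v2")}
# The cut "c1" separates the edge from its vertices on the sheet of assertion "T".
crossing = {"v1": "T", "v2": "T", "e1": "c1"}
# Edge and vertices are all directly enclosed by "c1".
aligned = {"v1": "c1", "v2": "c1", "e1": "c1"}

print(is_free_of_intersections(nu, crossing))  # False
print(is_free_of_intersections(nu, aligned))   # True
```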

It follows from the rules ‘merging two vertices’ and ‘reverse merging of two ver- 
tices’ that every concept graph is equivalent to a graph free of intersections. And 
these graphs are easy to read: They have a form which is closely related to the 
existential graphs without lines of identity, and the soundness and completeness 
of the first five rules concerning existential graphs can be applied now. This leads 
to the following essential theorem: 

Theorem 1 (soundness and completeness of the calculus).

Two nonexistential, simple concept graphs $\mathfrak{G}_1, \mathfrak{G}_2$ with
cuts over $\mathcal{A}$ satisfy

$\mathfrak{G}_1 \vdash \mathfrak{G}_2 \iff \mathfrak{G}_1 \models \mathfrak{G}_2$



5 Future Work

How to proceed with this work is clear. First the approach has to be extended
to include graphs with generic markers. The $\Phi$-operator for these graphs has
to be elaborated and it has to be proven that simple concept graphs with negation
ovals and identity are equivalent to first order logic. In part, this has already
been done (e.g. [BMT98]). Afterwards, the approach should be extended to the
nested case. It seems reasonable that nested graphs are equivalent to a certain
class of formulas of modal logic in such a way that nestings will be interpreted as
different possible worlds, which are connected by the structure of the nestings.
And again, a semantics and a sound and complete calculus have to be developed.

References 

BMT98. F. Baader, R. Molitor, S. Tobies: The Guarded Fragment of Conceptual
Graphs. RWTH LTCS-Report.
http://www-lti.informatik.rwth-aachen.de/Forschung/Papers.html

CMS98. M. Chein, M.-L. Mugnier, G. Simonet: Nested Graphs: A Graph-based
Knowledge Representation Model with FOL Semantics, Rapport de
Recherche, LIRMM, Université Montpellier II, 1998.

GW99a. B. Ganter, R. Wille: Formal Concept Analysis: Mathematical Foundations.
Springer, Berlin-Heidelberg-New York 1999.

GW99b. B. Ganter, R. Wille: Contextual attribute logic, in: W. Tepfenhart, W. Cyre
(Eds.): Conceptual Structures: Standards and Practices, Springer Verlag,
Berlin-New York 1999, 377-388.

LK96. D. Lukose, R. Kremer: Knowledge Engineering: PART A, Knowledge
Representation. http://www.cpsc.ucalgary.ca/~kremer/courses/CG/

Pe98. C. S. Peirce: Reasoning and the Logic of Things. The Cambridge Conferences
Lectures of 1898. Ed. by K. L. Ketner, Harvard Univ. Press, Cambridge 1992.

Pr98a. S. Prediger: Kontextuelle Urteilslogik mit Begriffsgraphen. Ein Beitrag zur
Restrukturierung der mathematischen Logik, Shaker Verlag 1998.

Pr98b. S. Prediger: Simple Concept Graphs: A Logic Approach, in: M.-L. Mugnier,
M. Chein (Eds.): Conceptual Structures: Theory, Tools and Applications,
Springer Verlag, Berlin-New York 1998, 225-239.

Ro73. D. D. Roberts: The Existential Graphs of Charles Sanders Peirce, Mouton,
The Hague - Paris 1973.

So84. J. F. Sowa: Conceptual Structures: Information Processing in Mind and
Machine. Addison Wesley Publishing Company, Reading, 1984.

So92. J. F. Sowa: Conceptual Graphs Summary, in: T. E. Nagle, J. A. Nagle,
L. L. Gerholz, P. W. Eklund (Eds.): Conceptual Structures: current research
and practice, Ellis Horwood, 1992, 3-51.

So99. J. F. Sowa: Conceptual Graphs: Draft Proposed American National
Standard, in: W. Tepfenhart, W. Cyre (Eds.): Conceptual Structures: Standards
and Practices, Springer Verlag, Berlin-New York 1999, 1-65.

So00. J. F. Sowa: Knowledge Representation: Logical, Philosophical, and
Computational Foundations. Brooks Cole Publishing Co., Pacific Grove, CA, 2000.

We95. M. Wermelinger: Conceptual Graphs and First-Order Logic, in: G. Ellis et al.
(Eds.): Conceptual Structures: Applications, Implementations and Theory,
Springer Verlag, Berlin-New York 1995, 323-337.




Extending the CG Model by Simulations 



Jean-François Baget

LIRMM (CNRS and Université Montpellier II)
161, rue Ada, 34392 Montpellier - France
baget@lirmm.fr



Abstract. Conceptual graphs (CGs) share with FOL a fundamental
expressiveness limitation: only higher-order logics allow assertions of
properties on predicates. This paper intends to push back this limit by
reifying underlying relations of CGs (is-a, a-kind-of, referent) into first-class
objects (i.e. nodes) of an equivalent labelled graphs (LG) model.
Benefits of this reification, applied on a subset of CGs, namely simple
graphs and rules of form “if G then H”, are discussed in terms of
expressiveness, succinctness and robustness. We show that using the LG model
as an interpreter allows us to improve and extend the results in [2].



1 Introduction 

Labelled Graphs have long been used as a natural and readable way to represent 
symbolic knowledge. The Conceptual Graphs model [11] can be seen as a higher 
level abstraction of Representation Networks [9,6], benefiting from further devel- 
opments in Knowledge Representation: a clear distinction between entities and 
relations, and between factual knowledge and background knowledge (Fig. 1). 

Though the CG model adds structure and clarity to the represented
knowledge, this improvement has a subtle trade-off: the relations s (for subset
of, or a kind of) and e (for element of, or is a), explicit in representation
networks, become implicit in CGs. Indeed, a kind of is encoded in the type
hierarchies defining background knowledge, and is a is encoded in concept node
labels.

The drawback is that CGs cannot handle these relations as “first-class
objects”: they are used in the deduction mechanism, but cannot be the object of
reasonings. From a FOL point of view, it is possible to assert properties on
objects, but not on relations (predicates) between objects. Some consequences are
pointed out in [2], in a CG model restricted to simple graphs (SGs) and “if ...
then” rules (SG rules). Namely:

1. It is not possible to express that “If two concept nodes have the same indi- 
vidual marker, they represent the same entity”. Instead, [2] uses one rule for 
each individual marker in the support (i.e. for each individual marker m, “if 
two concept nodes are marked by m, then they represent the same entity”). 

2. It is not possible to define the relation type EQUIVALENCE as a subtype of
REFLEXIVE, SYMMETRICAL and TRANSITIVE, such that these properties (themselves
encoded in rules) are inherited by EQUIVALENCE and all its subtypes.
Instead, [2] defines a family of rules expressing the equivalence property, that
must be implemented for every equivalence relation.

Fig. 1. Representation Networks and Conceptual Graphs Formalisms

These solutions are satisfying neither in terms of succinctness (families of rules
defined for each marker and/or type) nor in terms of robustness (when a new
type or marker is added, it is necessary to add new rules). Instead, we propose
to reify implicit relations of the SG model (IS-A, A-KIND-OF, REFERENT) into
first-class objects of a model that simulates SGs.

In Sect. 2 we present a general framework for simulating a reasoning model
into another one. This framework will then be applied for simulating the {SG,
rules} model (recalled in Sect. 3) into a {labelled graphs, rules} model. Sect. 4
and 5 study this latter model, its semantics, and its expressiveness with respect
to {SG, rules}. In Sect. 6, we build step by step a simulation of the {SG, rules}
model into the {labelled graphs, rules} model, where implicit notions of the
simulated model are reified. Finally, in Sect. 7, we discuss our gains in terms of
expressiveness and present an extension of the {SG, rules} model, obtained from
this simulation.

2 A Simulation Framework for Reasoning Models 

In this paper, we study how a reasoning model can gain expressiveness by a
simulation into another one. As in [7], we define a reasoning model by: a language
or syntax that specifies the objects we manipulate; a deduction system that
computes the relation $\models$ between objects; and valuation rules or semantics
that are given here through a translation into FOL formulas. To compare different
knowledge representation formalisms, [5] gives several evaluation criteria:

1. Does the formalism support efficient reasonings? 

2. How expressive is it? 

3. How succinctly can the formalism express the sets of models that it can? 

4. How does the knowledge representation form fare in the face of change? 

Comparison of reasoning models in terms of complexity or decidability of the
deduction problem (1.) is only briefly discussed in this paper. To compare the
expressiveness of two reasoning models (2.), we define the notion of simulation of
a reasoning model into another one, which is basically a reduction of a decision
problem to another. We propose here some criteria to evaluate the very subjective
notion of a good simulation, and discuss in the final part of this paper how
a simulation that reifies implicit notions of a model can add expressiveness,
succinctness (3.) and robustness (4.) to this model.

Definition (Simulation). A simulation of a reasoning model $\mathcal{M}$ into a
model $\mathcal{N}$ is a mapping $\Gamma$ from the objects of $\mathcal{M}$ into the
objects of $\mathcal{N}$ such that A can be deduced from B in $\mathcal{M}$ if and
only if $\Gamma(A)$ can be deduced from $\Gamma(B)$ in $\mathcal{N}$.



2.1 Complexity of $\Gamma$: Importance of a Linear Simulation

We want a simulation to prove that all reasonings in the model $\mathcal{M}$ can
be performed in $\mathcal{N}$, and wish to forbid part of the deduction mechanism
in $\mathcal{M}$ to be encoded in the mapping $\Gamma$. The usual way to assert that
$\mathcal{M}$ is "simpler" than $\mathcal{N}$ is to find a polynomial time simulation
of $\mathcal{M}$ into $\mathcal{N}$. But a polynomial time mapping could be complex
enough to encode in itself mechanisms of $\mathcal{M}$ that cannot be represented in
$\mathcal{N}$ (Sect. 6.1). We want a simulation to compute a syntactic translation
only. That is why we are interested in simulations that fulfill a much stronger
constraint: linear time mappings. If there exists a polynomial time (resp. linear
time) simulation of $\mathcal{M}$ into $\mathcal{N}$, we say that $\mathcal{N}$ is a
generalization (resp. strong generalization) of $\mathcal{M}$. Reasoning models that
generalize each other are said to be equivalent (resp. strongly equivalent).

2.2 Backward Translation of Objects and Proofs 

It is important, for every object A in $\mathcal{N}$, to compute efficiently whether
$A \in \Gamma(\mathcal{M})$ (i.e. whether A is the representation of an object of
$\mathcal{M}$) and, in that case, to translate A back into the unique object A' of
$\mathcal{M}$ such that $\Gamma(A') = A$; thus $\Gamma$ should be injective.
This feature makes it possible to develop a reasoning model on top of an existing
one by implementing only the simulation $\Gamma$ and its backward translation, in
such a way that this translation is invisible to the end user, who only needs to
manipulate objects of the model $\mathcal{M}$. The interest of backward translation
of objects extends to backward translation of proofs. Intuitively, if A and B are
objects of $\mathcal{M}$, a proof of $A \models B$ in the model $\mathcal{M}$ can be
seen as a sequence of operations allowed by the deduction system of $\mathcal{M}$.
Consider a proof of $\Gamma(A) \models \Gamma(B)$ in the model $\mathcal{N}$.
A proof of $A \models B$ in $\mathcal{M}$ can be built if there exists a surjective
mapping of the proofs in $\Gamma(\mathcal{M})$ into the proofs in $\mathcal{M}$. This
property would allow an end user to visualize all proofs in the model $\mathcal{M}$,
even if reasonings are interpreted into the model $\mathcal{N}$. Moreover, should we
want an application that computes deductions in $\mathcal{M}$ with some user
interaction features (step by step visualization of proofs, user assisted
deduction), this property should extend to subsequences of a proof.

2.3 Preservation of Complexity Classes 

Let $C_\mathcal{M}$ and $C_\mathcal{N}$ be the complexity classes of the deduction
problems in $\mathcal{M}$ and $\mathcal{N}$; let $A_\mathcal{M}$ and $A_\mathcal{N}$
be algorithms solving these problems ("respecting" the complexity classes
$C_\mathcal{M}$ and $C_\mathcal{N}$); and let $\Gamma$ be a simulation of
$\mathcal{M}$ into $\mathcal{N}$. We want to keep reasonings in the model
$\mathcal{M}$ as efficient as possible, even when they are interpreted through the
translation in the model $\mathcal{N}$. That is why $A_\mathcal{N}$, applied
to objects of $\Gamma(\mathcal{M})$, should satisfy $C_\mathcal{M}$. Note that, if
this is not the case, $A_\mathcal{M}$ can still be used, assuming that we have an
efficient recognition algorithm for $\Gamma(\mathcal{M})$, or can even give new
insights for a general improvement of $A_\mathcal{N}$. This discussion could also be
extended to the preservation of complexity classes for particular subsets of
$\mathcal{M}$.

In this paper, we propose a linear simulation $\Theta$ from the {SG, rules} model
into a "low-level" model. Since there also exists a linear simulation from the
latter model into {SG, rules}, both models are strongly equivalent. This
simulation ensures backward translations of objects as well as proofs, and, since
the two models are strongly equivalent, complexity classes of deduction
problems are preserved. The simulation $\Theta$ reifies implicit notions of the CG
model (types, markers) into first-class objects of the "low-level" model. Benefits
of this reification will be further discussed in the last section (Sect. 7) of
this paper.

3 The Simple Graphs and {SG, Rules} Models 

Here we recall fundamental definitions and results on a subset of CGs: simple
graphs (SGs) [4], and simple graph rules [8,2].

3.1 Simple Graphs 

Syntax. Basic ontological knowledge is encoded in a structure called a support.
A support $\mathcal{S} = (T_C, T_R, \mathcal{I})$ is given by two finite partially
ordered sets $T_C$ and $T_R$, respectively the set of concept types and the set of
binary^1 relation types, and a set of individual markers $\mathcal{I}$. $T_C$ (resp.
$T_R$) admits a greatest element, denoted by $\top_C$ (resp. $\top_R$). A partial
order is defined on $\mathcal{I} \cup \{*\}$: the generic marker * is the greatest
element and elements of $\mathcal{I}$ are pairwise non comparable.

We denote by $G = (V_C, V_R, E, label, co\text{-}ref)$ a simple graph defined on a
support, where $V_C$ and $V_R$ are respectively the sets of concept nodes and
relation nodes; E is the set of edges (the two edges incident to a relation node
are labelled by 1 and 2); co-ref, the co-reference relation, is an equivalence
relation on the set of generic concept nodes. Two concept nodes are said to be
co-identical if they have the same individual marker or if they belong to the same
co-reference class. Intuitively, it means that these two concept nodes represent
the same entity. In what follows, we impose that co-identical concept nodes have
the same type.

Deduction System. Let $\mathcal{S}$ be a support, and G and H be two SGs defined
on $\mathcal{S}$. A projection $\Pi$ from H into G is a mapping from $V_C(H)$ into
$V_C(G)$ and from $V_R(H)$ into $V_R(G)$ that preserves edges and their labelling,
may restrict labels of concept and relation nodes, and preserves co-identity:

1. $\forall e = (x_1, x_2) \in E(H)$, $e' = (\Pi(x_1), \Pi(x_2)) \in E(G)$ and label(e) = label(e')

2. $\forall x \in V_C(H) \cup V_R(H)$, $\Pi(x) = x' \Rightarrow label(x') \le label(x)$

3. $\forall x, y \in V_C(H)$, co-ref(x, y) $\Rightarrow$ co-ident($\Pi(x)$, $\Pi(y)$)

^1 For the sake of simplicity, we restrict these definitions to binary relation
types, but all results could easily be extended to n-ary relation types.
A SG is said to be in normal form if all co-identity classes are restricted to the
trivial ones (co-ident(x, y) $\Rightarrow$ x = y). The normal form NF(G) of a graph
G is obtained by merging all nodes that belong to the same co-identity class.

We consider two deduction systems in the SG model (effectively defining two
reasoning models): we write $\mathcal{S}, G \models H$ if there exists a projection
from H into G, and $\mathcal{S}, G \models_{NF} H$ if there exists a projection of H
into NF(G) (note that if $\mathcal{S}, G \models H$, then
$\mathcal{S}, G \models_{NF} H$, but the converse does not hold).

Semantics. The deduction system $\models_{NF}$ is sound and complete w.r.t. the FOL
semantics $\Phi$ [4], i.e. $\mathcal{S}, G \models_{NF} H$ iff
$\Phi(\mathcal{S}), \Phi(G) \models \Phi(H)$, while the deduction system $\models$
is sound and complete w.r.t. the FOL semantics $\Psi$ (an alternative to $\Phi$
proposed in [10]). A formal presentation of these semantics can be found in
[10], while [2] presents their consequences on the different reasonings allowed.

3.2 Simple Graph Rules 

Simple graph rules (SG rules) can be seen as graphs that embody knowledge of
the form "if hypothesis can be deduced from a graph, so can conclusion".

Syntax. Let $\mathcal{S}$ be a support. A simple graph rule is given by a simple
graph R defined on $\mathcal{S}$ and a mapping color: $V(R) \to \{0, 1\}$. The
subgraph of R generated by 0-colored (resp. 1-colored) nodes is called the
hypothesis (resp. conclusion) of the rule. Furthermore, the following constraints
ensure that the application of a rule on a SG always generates a SG: 1) the
subgraph of R generated by 0-colored nodes is a SG (but without restrictions on
co-ref); 2) co-identity classes whose members are of different colors are
forbidden; 3) co-identity classes whose members are 0-colored nodes are without
restriction; 4) co-identity classes whose members are 1-colored nodes are subject
to the same restrictions as for SGs.

Deduction System. Let $\mathcal{S}$ be a support, G be a SG defined on
$\mathcal{S}$, and R be a SG rule defined on $\mathcal{S}$. We say that R is
applicable to G iff there exists a projection from the hypothesis of R into G. Let
$\Pi$ be such a projection. An immediate derivation of G by the application of R
following $\Pi$ is a graph G' obtained by 1) making the disjoint union of G and a
copy of the conclusion of R; 2) for each edge (r, c) labelled n in R, where c is a
concept node of color 0 and r is a relation node of color 1, linking the copy of r
to $\Pi(c)$ by an edge labelled n.

Let $\mathcal{R}$ be a set of SG rules defined on $\mathcal{S}$. We say that
$G, \mathcal{R}$ derives a SG G' if there exists a sequence of immediate derivations
leading to G' by application of rules in $\mathcal{R}$. We say that
$G, \mathcal{R}$ normally derives G' if each immediate derivation is followed by a
normalization of the obtained graph.

Then again, we obtain two deduction systems in the {SG, rules} model. We
write $\mathcal{S}, \mathcal{R}, G \models H$ if there exists a graph G' such that
$G, \mathcal{R}$ derives G' and H can be projected into G'. We write
$\mathcal{S}, \mathcal{R}, G \models_{NF} H$ if there exists a graph G' such
that $NF(G), \mathcal{R}$ normally derives G' and H can be projected into G'.

Semantics. The FOL semantics $\Phi$ and $\Psi$ are extended to SG rules:
$\models_{NF}$ is sound and complete w.r.t. $\Phi$ [8], and $\models$ is sound and
complete w.r.t. $\Psi$ [2].
4 The Labelled Graphs Model

The labelled graphs (LG) model is very similar to the SG model, but does not 
take into account the notions of support or type: we consider it as a “syntactically 
neutral” version of SGs. We first define formally the objects we manipulate, then 
propose two deduction mechanisms based on projection whose semantics mimic 
the ones defined in the SG model. It is not surprising, this model being simpler, 
that there exists a linear simulation of labelled graphs into SGs. 

4.1 Syntax and Deduction Mechanisms 

Let $\mathcal{A}$ be a set of symbols, and $* \notin \mathcal{A}$ a special symbol
named the generic label. A labelled graph (LG) G = (V, E, label) is given by a set
of nodes V, a symmetrical relation E on $V \times V$, and a mapping label from V
into $\mathcal{A} \cup \{*\}$.

Let H and G be two LGs. A LG-projection (projection in the LG model) is a
mapping $\Pi$ from V(H) into V(G) that preserves edges and non-generic labels:

1. $\forall (x_1, x_2) \in E(H)$, $(\Pi(x_1), \Pi(x_2)) \in E(G)$

2. $\forall x \in V(H)$, $label(x) \neq * \Rightarrow label(\Pi(x)) = label(x)$

A LG is said to be in normal form if all non-generic labels are different. The
normal form NF(G) of a LG G is obtained by merging nodes having the same non-
generic label. We note iJ IZ G if there exists a LG-projection from H into G. 

We write $G \models H$ if there exists a LG-projection from H into G, and
$G \models_{NF} H$ if there exists a LG-projection from H into NF(G). Unless
otherwise noted, we will consider $\models$ when referring to the LG model.
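The two deduction mechanisms can be illustrated by a small exhaustive search. The following Python sketch is only an illustration under our own assumptions: the class `LG` and the functions `projections`, `normal_form` and `entails` are hypothetical names, graphs are stored as a label dictionary plus a symmetric set of node pairs, and no attempt is made at efficiency.

```python
from itertools import product

GENERIC = "*"  # the generic label


class LG:
    """A labelled graph: node -> label, plus a symmetric edge relation."""

    def __init__(self, labels, edges):
        self.labels = dict(labels)
        # Store both orientations so that E is symmetric.
        self.edges = {(a, b) for a, b in edges} | {(b, a) for a, b in edges}


def projections(h, g):
    """Yield every LG-projection from h into g (exhaustive search)."""
    h_nodes = list(h.labels)
    # Condition 2: a non-generic label must be preserved exactly.
    candidates = [
        [y for y, lab in g.labels.items()
         if h.labels[x] == GENERIC or lab == h.labels[x]]
        for x in h_nodes
    ]
    for images in product(*candidates):
        pi = dict(zip(h_nodes, images))
        # Condition 1: every edge of h is mapped onto an edge of g.
        if all((pi[a], pi[b]) in g.edges for a, b in h.edges):
            yield pi


def normal_form(g):
    """Merge all nodes that carry the same non-generic label."""
    representative, merged = {}, {}
    for node, lab in g.labels.items():
        merged[node] = node if lab == GENERIC else representative.setdefault(lab, node)
    labels = {merged[n]: lab for n, lab in g.labels.items()}
    edges = {(merged[a], merged[b]) for a, b in g.edges}
    return LG(labels, edges)


def entails(g, h, normalize=False):
    """G |= H, or G |=_NF H when normalize is True."""
    target = normal_form(g) if normalize else g
    return next(projections(h, target), None) is not None


# Two nodes labelled "a" must be merged before H can be projected.
g = LG({1: "a", 2: "b", 3: "a", 4: "c"}, [(1, 2), (3, 4)])
h = LG({"x": "a", "y": "b", "z": "c"}, [("x", "y"), ("x", "z")])
print(entails(g, h), entails(g, h, normalize=True))
```

In this example H cannot be projected into G itself, but it can be projected into NF(G), which shows why the two deduction systems differ.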

4.2 Linear Simulations of Labelled Graphs into SGs 

Lemma 1. The (SGs, $\models$) model is a strong generalization of the (LGs,
$\models$) model, and the (SGs, $\models_{NF}$) model is a strong generalization of
the (LGs, $\models_{NF}$) model.

Proof. This result is not surprising, but the simulations $\Gamma$ of (LGs,
$\models$) into (SGs, $\models$) and $\Gamma_{NF}$ of (LGs, $\models_{NF}$) into
(SGs, $\models_{NF}$) will be used in Theorem 1.

The simulation $\Gamma$ is illustrated in Fig. 2 by the transformation of the LG
$G_1$ into the SG $G_2$. We define a support $\Gamma(\mathcal{A})$ as follows: * is
associated to the greatest element of $T_C$, and each label of $\mathcal{A}$ is
associated to a distinct type of $T_C$ (denoted by the same symbol). All types
corresponding to $\mathcal{A}$ are pairwise non-comparable. $T_R$ is restricted to
one relation type, named link. $\mathcal{I}$ is empty. Each node of an LG is
transformed into a concept node whose type is the one associated with its label and
whose marker is generic. Each edge ab is transformed into two symmetrical relation
nodes typed link, linking $\Gamma(a)$ and $\Gamma(b)$.

$\Gamma_{NF}$ is illustrated in Fig. 2 by the transformation of $G_1$ into $G_3$.
The only concept type in $\Gamma_{NF}(\mathcal{A})$ is node, $T_R$ is restricted to
the relation type link, and $\mathcal{I}$ is equal to $\mathcal{A}$. The difference
with $\Gamma$ lies in concept node labelling: a node labelled m is transformed into
a concept node whose type is node, and whose marker is generic if m is generic, or
the element of $\mathcal{I}$ associated to m otherwise.

We now check that $\Gamma$ and $\Gamma_{NF}$ are linear transformations, that
$G \models H$ iff $\Gamma(\mathcal{A}), \Gamma(G) \models \Gamma(H)$, and that
$G \models_{NF} H$ iff
$\Gamma_{NF}(\mathcal{A}), \Gamma_{NF}(G) \models_{NF} \Gamma_{NF}(H)$. □
Fig. 2. Simulations of LG models in SGs



4.3 Semantics 

Semantics $\phi$ and $\psi$ for the two labelled graph models are defined as:

- $\psi(\mathcal{A}) = \Psi(\Gamma(\mathcal{A}))$ and $\psi(G) = \Psi(\Gamma(G))$

- $\phi(\mathcal{A}) = \Phi(\Gamma_{NF}(\mathcal{A}))$ and $\phi(G) = \Phi(\Gamma_{NF}(G))$

Theorem 1 (Soundness and completeness). The deduction system $\models$ (resp.
$\models_{NF}$) in the LG model is sound and complete w.r.t. the semantics $\psi$
(resp. $\phi$).

Proof. This is a corollary of Lem. 1, since $\Gamma$ only creates SGs in normal
form. □



5 The {Labelled Graphs, Rules} Model 

Labelled graph rules (LG rules) are designed as a “light syntax” version of SG 
rules. Extending the results in Sect. 4, we simulate the {LG, rules} model into 
the {SG, rules} model, and give sound and complete semantics. 

5.1 Syntax and Deduction Mechanisms 

A LG rule R = (V, E, label, color) is given by a LG (V, E, label) and a mapping
color from V into {0, 1}. The subgraph hyp(R) (resp. conc(R)) generated by
0-colored (resp. 1-colored) nodes, circled in white (resp. gray) in the
representation of the graph, is called the hypothesis (resp. conclusion) of the
rule (Fig. 3).

A LG rule R is applicable to a LG G if there exists a LG-projection (say $\Pi$)
from the hypothesis of R into G. In that case, a LG G' is an immediate derivation
of G following R and $\Pi$ if G' is obtained by making the disjoint union of G and
a copy of conc(R), then, for each edge $hc \in E(R)$ such that $h \in hyp(R)$ and
$c \in conc(R)$, adding an edge between $\Pi(h)$ and copy(c).

The mechanism of rule application is illustrated in Fig. 3, where $G_1, \ldots, G_5$
are the immediate derivations of G following R. The notions of derivation and
normal derivation are defined as in Sect. 3.2, as a sequence of immediate
derivations (each one followed by a normalization in the latter case). We write
$\mathcal{R}, G \models H$ if there exists a graph G' such that $G, \mathcal{R}$
derives G' and there exists a LG-projection of H into G'. We write
$\mathcal{R}, G \models_{NF} H$ if there exists a graph G' such that
$NF(G), \mathcal{R}$ normally derives G' and there exists a LG-projection of H into
G'. Unless otherwise noted, we will use $\models$ when considering the {LG, rules}
model.
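The rule application mechanism described above can be sketched as follows. This is a minimal illustration, not the implementation used by the authors: the function names, the representation of a rule as `(labels, edges, color)` and the naming of fresh nodes as `("new", i)` are all assumptions made for the example.

```python
from itertools import count, product

GENERIC = "*"


def symmetric(edges):
    """Close a set of node pairs under symmetry."""
    return set(edges) | {(b, a) for a, b in edges}


def projections(h_labels, h_edges, g_labels, g_edges):
    """Yield every LG-projection from (h_labels, h_edges) into G."""
    nodes = list(h_labels)
    candidates = [
        [y for y, lab in g_labels.items()
         if h_labels[x] == GENERIC or lab == h_labels[x]]
        for x in nodes
    ]
    for images in product(*candidates):
        pi = dict(zip(nodes, images))
        if all((pi[a], pi[b]) in g_edges for a, b in h_edges):
            yield pi


def immediate_derivations(g_labels, g_edges, r_labels, r_edges, color):
    """Yield (projection, labels, edges) for each immediate derivation of G."""
    g_edges, r_edges = symmetric(g_edges), symmetric(r_edges)
    hyp_labels = {n: lab for n, lab in r_labels.items() if color[n] == 0}
    hyp_edges = {(a, b) for a, b in r_edges if color[a] == 0 and color[b] == 0}
    for pi in projections(hyp_labels, hyp_edges, g_labels, g_edges):
        # Fresh node names for the copy of the conclusion.
        fresh = (("new", i) for i in count() if ("new", i) not in g_labels)
        copy = {c: next(fresh) for c in r_labels if color[c] == 1}
        labels = {**g_labels, **{copy[c]: r_labels[c] for c in copy}}
        edges = set(g_edges)
        for a, b in r_edges:
            if color[a] == 1 and color[b] == 1:      # edge inside conc(R)
                edges.add((copy[a], copy[b]))
            elif color[a] == 0 and color[b] == 1:    # edge from hyp(R) to conc(R)
                edges.add((pi[a], copy[b]))
            elif color[a] == 1 and color[b] == 0:    # same edge, other orientation
                edges.add((copy[a], pi[b]))
        yield pi, labels, edges


# Hypothetical rule: every node labelled "man" gets a neighbour labelled "mortal".
g_labels, g_edges = {1: "man", 2: "man", 3: "car"}, {(1, 3)}
r_labels, r_edges = {"h": "man", "c": "mortal"}, {("h", "c")}
color = {"h": 0, "c": 1}
derivations = list(immediate_derivations(g_labels, g_edges, r_labels, r_edges, color))
print(len(derivations))
```

Each projection of the hypothesis gives one immediate derivation; here the hypothesis can be projected onto either of the two nodes labelled "man".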
Fig. 3. Five immediate derivations of a graph G



5.2 Linear Simulations of Labelled Graphs Rules into SG Rules 

Lemma 2. The ({SGs, rules}, $\models$) (resp. ({SGs, rules}, $\models_{NF}$)) model
is a strong generalization of the ({LGs, rules}, $\models$) (resp. ({LGs, rules},
$\models_{NF}$)) model.

Proof. As in the proof of Lemma 1, we exhibit two simulations $\Lambda$ and
$\Lambda_{NF}$. Let R = (G, color) be a LG rule. $\Lambda(R)$ (resp.
$\Lambda_{NF}(R)$) is obtained by adding to $\Gamma(G)$ (resp. $\Gamma_{NF}(G)$) the
following coloration: the color 0 is assigned to nodes issued from the hypothesis
of R, the color 1 to all others. We can verify that these simulations are linear,
that, thanks to Lemma 1, a LG rule R is applicable to a labelled graph G iff
$\Lambda(R)$ (resp. $\Lambda_{NF}(R)$) is applicable to $\Gamma(G)$ (resp.
$\Gamma_{NF}(G)$), and that, to every immediate derivation of a LG G into G'
following R corresponds an immediate derivation of $\Gamma(G)$ into $\Gamma(G')$
following $\Lambda(R)$. □



5.3 Semantics 

We extend the semantics $\phi$ and $\psi$ as in Sect. 4. Let R be a LG rule:

- $\psi(R) = \Psi(\Lambda(R))$

- $\phi(R) = \Phi(\Lambda_{NF}(R))$

Theorem 2 (Soundness and completeness). The deduction system $\models$ (resp.
$\models_{NF}$) in the {LG, rules} model is sound and complete w.r.t. $\psi$ (resp.
$\phi$).



Proof. Lemma 2, soundness and completeness for the {SG, rules} models. □ 



6 Simulations of {SG, Rules} into {LG, Rules} 

Having defined the LG and {LG, rules} models as particular cases of SG and
{SG, rules} (Lem. 1 and 2), we now simulate various SG models (beginning
with a basic one) into {LG, rules} models by reifying implicit relations of SGs.
6.1 Basic Simple Graphs

We call basic simple graph (BSG) a SG defined on a support $\mathcal{S}$ without
co-reference links (all co-reference classes are trivial) and without individual
markers ($\mathcal{I} = \emptyset$). BSGs being in normal form, the $\Phi$ and
$\Psi$ semantics are equivalent.

Proposition 1. The BSG model is equivalent to the LG model. 

Proof. We have proven (Lemma 1) that the BSG model is a strong generalization
of the LG model, since $\Gamma$ produces only BSGs. We now exhibit a polynomial
simulation of the BSG model into the LG model. Fig. 4 represents the simulation,
by the mapping $\Theta_1$, of the graph in Fig. 1 (without its individual markers).




[Figure: a chain of skeleton nodes CONCEPT - 1 - RELATION - 2 - CONCEPT - 1 -
RELATION - 2 - CONCEPT, each linked by IS-A nodes to type nodes (among them Man,
agent, Owning and Automobile).]

Fig. 4. Simulation of the BSG model into the LG model via $\Theta_1$



We first define skel(G), a LG called the skeleton of G (boldface in Fig. 4),
obtained by creating a node skel(x) labelled by CONCEPT (resp. RELATION) for
each concept node (resp. relation node) x in G, and, for each edge $rc \in G$, a
node labelled by the label of this edge, whose neighbors are skel(r) and skel(c).

The LG $\Theta_1(G, \mathcal{S})$ is then obtained from skel(G) by adding a node
labelled by t for each type $t \in T_C \cup T_R$ (there must be no identical symbol
in these two sets), then, for each node x typed t in G and for each type t' in
$T_C \cup T_R$ such that $t \le t'$, adding a node labelled IS-A linking skel(x) to
the node labelled by t'. We denote by $\Theta_1^f(G)$ a subgraph of $\Theta_1$ where
nodes are only linked by IS-A to their most specific type (the subgraph in the
white rectangle of Fig. 4).

We check that $\Theta_1$ is a polynomial simulation. For computational efficiency,
we point out that, G and H being two BSGs defined on $\mathcal{S}$:
$\mathcal{S}, G \models H$ iff $\Theta_1(G, \mathcal{S}) \models \Theta_1^f(H)$
(leading to a smaller graph to project). □
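The construction used in this proof can be sketched in Python. The sketch below is an illustration under explicit assumptions: the function names (`supertypes`, `theta1`, `projects`), the encoding of a BSG as two dictionaries, the encoding of the support as a map from each type to its direct supertypes, and the choice to keep only the type nodes actually used in the reduced variant are ours, not the paper's. Type names are assumed to differ from the reserved labels CONCEPT, RELATION, IS-A, 1 and 2.

```python
from itertools import product


def supertypes(t, parents):
    """All t' with t <= t' (reflexive-transitive closure of direct supertypes)."""
    seen, stack = {t}, [t]
    while stack:
        for p in parents.get(stack.pop(), ()):
            if p not in seen:
                seen.add(p)
                stack.append(p)
    return seen


def theta1(concepts, relations, parents, reduced=False):
    """Return (labels, edges) of the LG built from a BSG.

    concepts:  concept node -> type
    relations: relation node -> (type, first neighbour, second neighbour)
    parents:   type -> direct supertypes (encodes the support)
    reduced:   link each node to its most specific type only
    """
    labels, edges = {}, set()

    def link(a, b):
        edges.update({(a, b), (b, a)})

    typed = dict(concepts)
    for c in concepts:
        labels[("skel", c)] = "CONCEPT"
    for r, (t, c1, c2) in relations.items():
        typed[r] = t
        labels[("skel", r)] = "RELATION"
        for i, c in ((1, c1), (2, c2)):      # one node per labelled edge
            labels[("edge", r, i)] = str(i)
            link(("skel", r), ("edge", r, i))
            link(("edge", r, i), ("skel", c))
    if not reduced:                           # one node per type of the support
        all_types = set(parents) | {p for ps in parents.values() for p in ps}
        for t in all_types:
            labels[("type", t)] = t
    for x, t in typed.items():
        for sup in ({t} if reduced else supertypes(t, parents)):
            labels[("type", sup)] = sup
            labels[("is-a", x, sup)] = "IS-A"
            link(("skel", x), ("is-a", x, sup))
            link(("is-a", x, sup), ("type", sup))
    return labels, edges


def projects(h, g):
    """True iff an LG-projection from h into g exists (no generic labels here)."""
    (hl, he), (gl, ge) = h, g
    nodes = list(hl)
    cands = [[y for y in gl if gl[y] == hl[x]] for x in nodes]
    return any(
        all((pi[a], pi[b]) in ge for a, b in he)
        for pi in (dict(zip(nodes, img)) for img in product(*cands))
    )


# Hypothetical support: Man < Person < Entity, Owning < Entity, agent < link.
parents = {"Man": ["Person"], "Person": ["Entity"],
           "Owning": ["Entity"], "agent": ["link"]}
G = ({"a": "Man", "b": "Owning"}, {"r": ("agent", "b", "a")})
H = ({"x": "Person", "y": "Owning"}, {"s": ("agent", "y", "x")})
g_entails_h = projects(theta1(*H, parents, reduced=True), theta1(*G, parents))
h_entails_g = projects(theta1(*G, parents, reduced=True), theta1(*H, parents))
print(g_entails_h, h_entails_g)
```

The call to `supertypes` is where the mapping does real reasoning on the type hierarchy: the size of the output grows with the number of supertypes of each node, which is why this simulation is polynomial rather than linear.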

We doubt it is possible to find a linear simulation of the BSG model into 
the LG model. Intuitively, reasonings by LG-projection and reasonings on the 
hierarchy of types have a fundamentally different nature: one is concerned with 
existence of objects, the other embodies knowledge on all types that verify some 
property. Reasonings on a hierarchy are basically rules. The following simulations 
will be given by a linear mapping translating only syntactic information of a SG, 
and by a constant set of rules (considered as a library for the LG interpreter) 
that encode reasonings that cannot be achieved by mere LG-projection. 

Proposition 2. The {LG, rules} model is a strong generalization of the BSG 
model. 
[Figure: LG rules titled "Transitivity of the partial orders", "Bootstrapping for
reflexivity (1)", "Bootstrapping for reflexivity (2)", "Reflexivity of partial
orders" and "Inheritance".]

Fig. 6. Library $\mathcal{R}_S$ of LG-rules used for the transformation $\Theta_2$



Proof. Let G be a BSG defined on $\mathcal{S}$. We first encode the support in a LG
enc($\mathcal{S}$). This graph contains two particular nodes, labelled TOP-CONCEPT
and TOP-RELATION, that represent the types $\top_C$ and $\top_R$. Then, for each
type t being a direct subtype of types $t_1, \ldots, t_k$, we add a new node
labelled t that is linked by a chain - 1 - A-KIND-OF - 2 - to the representation of
each of the $t_i$. The LG $\Theta_2(G, \mathcal{S})$ is obtained by the disjoint
union of skel(G) (see proof of Prop. 1) and enc($\mathcal{S}$), then, for each node
labelled CONCEPT or RELATION in this skeleton, linking it by a chain^2 - IS-A - to
the node labelled by its most specific type.

The graph in Fig. 5 illustrates this transformation: note that, this time, the
transformation of the support has been purely syntactic. No information on the
properties of IS-A or A-KIND-OF has been needed to generate this graph. These
properties are given by the rules $\mathcal{R}_S$ in Fig. 6. These rules express
the transitivity and reflexivity of the partial order on types, and that an entity
or relation inherits all the supertypes of its given type. Though only this last
rule is necessary to prove the proposition, they will all be used when simulating
co-reference.

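We must now check that $\Theta_2$ is linear and verify the following equivalences
(the last one being presented in an optimization perspective):
$\mathcal{S}, G \models H$ iff
$\Theta_2(G, \mathcal{S}), \mathcal{R}_S \models \Theta_2(H, \mathcal{S})$ iff
$\Theta_2(G, \mathcal{S}), \mathcal{R}_S \models \Theta_2^f(H)$. □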



^2 As for A-KIND-OF, we could have used a chain - 1 - IS-A - 2 -. This is not
necessary here, since the "orientation" is implicitly given, from the object to its
type.







Remark 1. All rules presented here require that there be a unique node representing
each type (and likewise each individual marker). All LG rules to be applied on SGs
must be designed in such a way that the graphs they generate keep this invariant
true.



6.2 Introducing Co-reference and Individual Markers 

Proposition 3. The {LG, rules} model is a strong generalization of the (SG,
$\models$) and the (SG, $\models_{NF}$) models.

Proof. Let G be a SG on a support $\mathcal{S}$. As in [2], reification of
co-identity is done by simulation into the SG model itself. $\theta(\mathcal{S})$
is the support obtained by adding to $\mathcal{S}$ four new relation types, namely
reflexive, symmetrical, transitive and equivalence, the first three being subtypes
of $\top_R$, the last a subtype of the three others, then adding co-ident as a
subtype of equivalence, and co-ref as a subtype of co-ident. $\theta(G)$ is
obtained by adding a relation node typed co-ref between nodes in the same
co-reference class (this simulation is linear: we only need to generate n - 1 nodes
for a co-reference class of size n). It remains now to handle individual markers.
The mapping $\Theta_3$ is an extension of $\Theta_2$. The graph obtained with
$\Theta_3$ contains one node labelled m for each individual marker m in
$\mathcal{I}$ (required in Sect. 6.3), all linked to a unique node labelled MARKERS.
Each node labelled CONCEPT obtained from an individual node marked m is linked
by a node labelled REFERENT to the node representing m. In $\Theta_3^f$, only the
most specific types and the individual markers present in the graph are
represented.
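The linearity argument for the reification of co-reference can be made concrete with a few lines of Python. The sketch assumes one possible layout, a star centred on an arbitrary member of each class; the function name `coref_relation_nodes` and this layout are illustrative choices, since any set of n - 1 links connecting the class would do once co-ref is handled as an equivalence relation by the rules.

```python
def coref_relation_nodes(coref_classes):
    """Return the argument pairs of the co-ref relation nodes to add.

    Each co-reference class of size n yields n - 1 pairs, arranged as a
    star centred on its first member (in sorted order).
    """
    pairs = []
    for cls in coref_classes:
        members = sorted(cls)
        pairs.extend((members[0], other) for other in members[1:])
    return pairs


# Hypothetical co-reference classes over concept nodes c1..c6.
classes = [{"c1", "c2", "c3"}, {"c4"}, {"c5", "c6"}]
links = coref_relation_nodes(classes)
print(links)
```

A trivial class contributes no relation node, and the total number of added nodes is bounded by the number of generic concept nodes.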




Fig. 7. Library $\mathcal{R}_C$ of LG-rules defining the co-identity relation



Fig. 7 defines the set of rules $\mathcal{R}_C$: the first three rules ensure that,
for every relation type declared as a subtype of equivalence (and in particular
co-ref and co-ident), this relation behaves as an equivalence relation. The last
one indicates that concept nodes having the same marker are co-ident. These rules
are sufficient to simulate the (SG, $\models$) model, i.e.:
$\mathcal{S}, G \models H$ iff
$\Theta_3(\theta(G), \theta(\mathcal{S})), \mathcal{R}_S \cup \mathcal{R}_C \models \Theta_3^f(\theta(H))$
(proof of this assertion is given by Prop. 2, pointing out that these rules are a
"higher-order version" of the ones presented in [2]).
Fig. 8. Library $\mathcal{R}_N$ of LG-rules handling the normalization process



To simulate the normalization step for deduction in the (SG, $\models_{NF}$) model,
we add another set of rules, $\mathcal{R}_N$ (Fig. 8). Again, thanks to Prop. 2, by
verifying that these rules are a "higher-order version" of the ones presented in
[2], we prove that $\mathcal{S}, G \models_{NF} H$ iff
$\Theta_3(\theta(G), \theta(\mathcal{S})), \mathcal{R}_S \cup \mathcal{R}_C \cup \mathcal{R}_N \models \Theta_3^f(\theta(H))$. □

6.3 Introducing Rules 

Theorem 3. The {LG, rules}, ({SG, rules}, $\models$) and ({SG, rules},
$\models_{NF}$) models are strongly equivalent.

Proof. Sect. 5 proves one part of the equivalence; we must now prove that the
{LG, rules} model is a strong generalization of both the ({SG, rules}, $\models$)
and the ({SG, rules}, $\models_{NF}$) models. Let T be a linear time mapping from SG
rules into LG rules, defined as follows. Let R = (G, color) be a SG rule defined on
$\mathcal{S}$. T(R) is obtained from $\Theta_3(\theta(G))$ (the simulation of a SG
into a LG) by assigning the color 0 to all nodes defining the support and to nodes
representing the hypothesis of R, and the color 1 to all others (Fig. 9). This
translation is designed in such a way that the "unique type and marker" invariant
(see Rem. 1) is preserved by derivation. Let $\mathcal{S}$ be a support, and G and
H be two SGs defined on $\mathcal{S}$; we prove that:

1. $\mathcal{S}, \mathcal{R}, G \models H$ iff $\Theta_3(\theta(G), \theta(\mathcal{S})), T(\mathcal{R}) \cup \mathcal{R}_S \cup \mathcal{R}_C \models \Theta_3^f(\theta(H))$

2. $\mathcal{S}, \mathcal{R}, G \models_{NF} H$ iff $\Theta_3(\theta(G), \theta(\mathcal{S})), T(\mathcal{R}) \cup \mathcal{R}_S \cup \mathcal{R}_C \cup \mathcal{R}_N \models \Theta_3^f(\theta(H))$




Fig. 9. A SG rule and the obtained transformation by T 



Check that the application of a SG rule R on a SG G gives a SG G' iff the
simulation of G derives the simulation of G' by some applications of the LG
rules $\mathcal{R}_S \cup \mathcal{R}_C$, then an application of T(R). □

Corollary 1. The {Nested Graphs, rules} model is strongly equivalent to the 
{LG, rules} model. 

Proof. [2] gives a simulation of nested graphs into SGs, using a set $\mathcal{N}$
of SG rules. We compose this linear simulation with $\Theta_3$, and translate
$\mathcal{N}$ by T. □
7 Conclusion

In this paper, we have explored a basic projection-based reasoning model, the
LG model, a priori less expressive than the SG model. However, we have proven
that the {LG, rules} model is strongly equivalent to the {SG, rules} and even
the {Nested graphs, rules} models. The simulation $\Theta$ exhibited to translate
an instance of the {SG, rules} deduction problem into an instance of the {LG,
rules} deduction problem possesses all the "good properties" discussed in Sect. 2:

1. $\Theta$ is linear;

2. We prove, though only hints are included in this paper, there exists a linear 
(resp. polynomial) backward translation of objects (resp. proofs) of the {LG, 
rules} model into objects (resp. proofs) of the {SG, rules} model [3]; 

3. Since the two models are strongly equivalent, they are both in the same
complexity class (i.e. deduction problems are semi-decidable in both models).

Let $\mathcal{M}$ be a reasoning model: the {LG, rules} model can be seen as an
interpreter for $\mathcal{M}$ if it is provided with a simulation of $\mathcal{M}$,
backward translation mechanisms, and a library of LG-rules that mimics specific
reasonings of $\mathcal{M}$. But, having built an interpreter for the {SG, rules}
and {NG, rules} models by designing a single, easier to implement model, what are
the benefits of this simulation? We first discuss its added expressiveness, in
terms of succinctness and robustness, then show how the simulation can be used to
extend the model.

7.1 The Succinctness and Robustness Criteria

Let us first point out the "rules factorization" gained by the simulation. Though
[2] gives SG rules for the same result, it requires: $3 \times k$ rules to indicate
that $r_1, \ldots, r_k$ are equivalence relations; $|\mathcal{I}|$ rules to indicate
that nodes sharing an individual marker are co-identical; $2 \times |T_R|$ rules to
simulate normalization.

Using our LG rules library, which uses types and markers as "first-class objects"
of the model, we need only a constant number of rules (11), before declaring
$r_1, \ldots, r_k$ as subtypes of equivalence.
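To give an order of magnitude for this factorization, the two counts can be compared on a made-up support. The figures below (4 equivalence relations, 100 individual markers, 30 relation types) are purely hypothetical, and the function name is ours; only the two formulas come from the text.

```python
LG_LIBRARY_SIZE = 11  # constant number of LG rules quoted in the text


def sg_rules_needed(k, n_markers, n_relation_types):
    """Number of SG rules needed: 3k + |I| + 2|T_R|."""
    return 3 * k + n_markers + 2 * n_relation_types


# Hypothetical support: 4 equivalence relations, 100 markers, 30 relation types.
needed = sg_rules_needed(k=4, n_markers=100, n_relation_types=30)
print(needed, LG_LIBRARY_SIZE)
```

The first count grows with every marker or relation type added to the support, whereas the second does not change.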

But there is another benefit: using the rules defined in [2], one must add new
rules each time a new type or a new marker is added to the support, otherwise
reasonings are not complete. And keeping track of all the rules that must be
added when updating the support in a complex model soon becomes difficult.
Reification, in contrast, makes it possible to express properties on types and
markers. These properties can be encoded in libraries of LG rules, which implement
various SG semantics, in such a way that they can be invisible to the end user. The
interest is that these properties can be inherited, and do not need to be encoded
again by the end user. Reification of types and markers indeed adds expressiveness,
both in terms of succinctness (factorization of rules) and in terms of robustness
(resistance to changes of data).

Finally, we point out that our simulation is complete only for LGs verifying 
the “unique type and marker” assumption (see Rem. 1). The problem is we have 
represented knowledge “two entities have the same type” by “the types of these 
two entities are represented by the same node”. Should we want to overcome 
this limitation, we could define a “meta-relation” expressing that knowledge. 
7.2 An Extended Simple Graphs Model

The SG model imposes strong syntactic conditions on co-reference links: for
example, merging two concept nodes having different types or different individual
markers is impossible in the model, hence the draconian constraints on SG rules,
designed in such a way that co-referent nodes will never have different types or
markers. But the deduction mechanism of the {LG, rules} model, applied to the
simulation of a SG, ignores such constraints. Then a natural question is: "what
happens if we remove all constraints on co-reference, and make reasonings on the
simulation of these objects? How does the result of these reasonings translate
back to the SG model?"

The first problem is to handle the case of co-identical nodes having different
types or markers. The set of rules $\mathcal{R}_D$ presented in Fig. 10 is an answer
to that problem. They indicate that if two nodes having type t and t' represent the
same entity, then the type of this entity is some subtype of t and t'; and the
marker of a node can be considered as a marker for all its co-ident nodes. We
consider the library $\mathcal{R}$ of LG rules consisting of
$\mathcal{R}_S \cup \mathcal{R}_C \cup \mathcal{R}_N \cup \mathcal{R}_D$. We can
extend the {SG, rules} model by dropping all constraints on co-reference, and
simulate it (using the transformation $\Theta$) into the {LG, rules} model with the
library $\mathcal{R}$.

Let us now outline an extended simple graph (ESG) model, which allows one to
translate back those reasonings. An ESG $G = (V_C, V_R, E, label, co\text{-}ref)$,
defined on a support $\mathcal{S}$, can be seen as a simple graph where:

- the type of a concept node is a non-empty subset of $T_C$;

- the marker of a concept node is a subset of $\mathcal{I}$ or the generic marker *;

- there is no constraint on concept nodes that can be declared co-referent;

- two individual concept nodes x, y are co-ident if $marker(x) \cap marker(y) \neq \emptyset$.

Intuitively, a set of types $\{t_1, \ldots, t_k\}$ (as used, for example, in [1])
can be seen as the conjunction of types $t_1 \sqcap \cdots \sqcap t_k$, and a set of
individual markers as aliases for the same entity. The partial ordering $\le_e$ on
labels is used for projection:




Fig. 10. Library $\mathcal{R}_D$ of LG-rules handling co-identity without restrictions



- Let t and t' be two types. Then $t \le_e t'$ iff $\forall t'_i \in t'$,
$\exists t_j \in t$ such that $t_j \le t'_i$.

- Let m and m' be two individual markers. Then $m \le_e m'$ (and $m' \le_e m$) iff
$m \cap m' \neq \emptyset$. Moreover, for every marker m, $m \le_e *$.

An ESG is in normal form if all its co-identity classes are trivial. An ESG
is put into normal form by merging co-identical nodes. The label resulting
from the fusion of [t : m] and [t' : m'] is $[t \cup t' : M]$, where
$M = m \cup m'$ if both markers are individual, M = * if both are generic, and
otherwise M is whichever of m and m' is individual. The deduction system in the
{ESG, rules} model is based upon normal derivation. Sound and complete FOL
semantics can be given, using the semantics of the equivalent {LG, rules} model, or
directly extending $\Phi$ (see [3]).
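The ordering on ESG labels and the fusion of two labels can be sketched as follows. This is an illustration under our own assumptions: types and individual markers are represented as frozensets, the generic marker as the string "*", and the type hierarchy by a hypothetical `ANCESTORS` table; none of these names come from the paper.

```python
GENERIC = "*"

# Hypothetical type hierarchy: each type maps to the set of its supertypes
# (itself included), so that leq(a, b) means a <= b.
ANCESTORS = {
    "Man": {"Man", "Person"},
    "Owner": {"Owner"},
    "Person": {"Person"},
}


def leq(a, b):
    return b in ANCESTORS[a]


def leq_types(t, t2):
    """t <=_e t2 iff every type of t2 has some subtype in t."""
    return all(any(leq(tj, ti) for tj in t) for ti in t2)


def leq_markers(m, m2):
    """m <=_e m2: anything is below *, individual markers must intersect."""
    if m2 == GENERIC:
        return True
    if m == GENERIC:
        return False
    return bool(m & m2)


def fuse(label1, label2):
    """Label of the node obtained by merging two co-identical nodes."""
    (t1, m1), (t2, m2) = label1, label2
    if m1 == GENERIC:
        marker = m2            # generic + anything: keep the other marker
    elif m2 == GENERIC:
        marker = m1
    else:
        marker = m1 | m2       # two individual markers: union of aliases
    return (t1 | t2, marker)


john_man = (frozenset({"Man"}), frozenset({"john"}))
some_owner = (frozenset({"Owner"}), GENERIC)
merged = fuse(john_man, some_owner)
print(merged)
```

With these definitions, the fused label carries the type set {Man, Owner}, which is below {Person} for the extended ordering, so a node typed Person can still be projected onto the merged node.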

This example shows that simulation adds robustness not only with respect 
to changes in data (e.g. changes in the support, as discussed above) but also with 
respect to changes in the model. We believe that an {LG, rules}-based interpreter, 
provided with macros that describe syntactic translations of graphs, can be the 
basis of a good prototyping and development tool for different graph-based 
reasoning models. 

Abstract. Given a finite set $\mathcal{C} := \{C_1, \ldots, C_n\}$ of description logic 
concepts, we are interested in computing the subsumption hierarchy of all 
least common subsumers of subsets of $\mathcal{C}$. This hierarchy can be used to 
support the bottom-up construction and the structuring of description 
logic knowledge bases. The point is to compute this hierarchy without 
having to compute the least common subsumer for all subsets of $\mathcal{C}$. We 
will show that methods from formal concept analysis developed for 
computing concept lattices can be employed for this purpose. 



1 Introduction 

Knowledge representation systems based on description logics (DL) can be used 
to describe the knowledge of an application domain in a structured and for- 
mally well-understood way. Traditionally, the knowledge base of a DL system is 
built in a top-down manner. First, the relevant concepts of the application do- 
main (its terminology) are formalized by concept descriptions, i.e., expressions 
that are built from atomic concepts (unary predicates) and atomic roles (binary 
predicates) using the concept constructors provided by the DL language. In a 
second step, these concept descriptions are then used to specify properties of 
objects occurring in the domain. The standard inference procedures provided by 
DL systems (like computing the subsumption hierarchy between concepts, and 
testing for implied instance relationships between objects and concepts) support 
this traditional approach to building DL knowledge bases. 

The main problem with the top-down approach is that it presupposes a 
knowledge engineer who is an expert both in description logics and in the 
application domain. Less experienced knowledge engineers encounter (at least one 
of) the following two problems: (1) it is often not clear which are the relevant 
concepts in an application; and (2) even if it is clear which (intuitive) concepts 
should be introduced, it is sometimes difficult to come up with formal definitions 

* This work was partially supported by the Deutsche Forschungsgemeinschaft Grant 
No. GRK 185/3-98. 
of these concepts in the given DL language. It has turned out that providing 
adequate support for overcoming these problems requires additional (non-standard) 
inference procedures for DL systems. 

In [2,3], we propose to support the construction of DL knowledge bases in 
a bottom-up fashion: instead of directly defining a new concept, the knowledge 
engineer introduces several typical examples as objects, which are then 
automatically generalized into a concept description by the system. This description 
is offered to the knowledge engineer as a possible candidate for a definition of 
the concept. The task of computing such a concept description can be split into 
two subtasks: computing the most specific concepts of the given objects, and 
then computing the least common subsumer of these concepts. The most 
specific concept (msc) of an object o (the least common subsumer (lcs) of concept 
descriptions $C_1, \ldots, C_n$) is the most specific concept description C expressible 
in the given DL language that has o as an instance (that subsumes $C_1, \ldots, C_n$). 
The problem of computing the lcs and (to a limited extent) the msc has already 
been investigated in the literature [5, 9, 2, 3]. 

Here, we will address an additional problem that occurs in the bottom-up 
approach: obviously, the choice of the examples is crucial for the quality of the 
result. If the examples are too similar, the resulting concept might be too spe- 
cific. Conversely, if the examples are too different, the resulting concept is likely 
to be too general. Our goal in this paper is to support the process of choosing 
an appropriate set of objects as examples. Assume that $C_1, \ldots, C_n$ are the most 
specific concepts of a given collection of objects $o_1, \ldots, o_n$, and that we intend 
to use subsets of this collection for constructing new concepts. In order to avoid 
obtaining concepts that are too general or too specific, it would be good to 
know the position of the corresponding lcs in the subsumption hierarchy of all 
least common subsumers of subsets of $\{C_1, \ldots, C_n\}$. Since there are 
exponentially many subsets to be considered, and (depending on the DL language) both 
computing the lcs and testing for subsumption can be expensive operations, 
we want to obtain complete information on what this hierarchy looks like 
without computing the least common subsumers of all subsets of $\{C_1, \ldots, C_n\}$, and 
without explicitly making all the subsumption tests between these least common 
subsumers. 

This is where methods from formal concept analysis [11] come into play. We 
shall show that the dual of the attribute exploration algorithm [10,16,11] (called 
object exploration in the following) can be adapted to our purposes. To be more 
precise, given a formal context, the attribute (object) exploration algorithm 
computes the concept lattice as well as a minimal implication base, the so-called 
(dual) Duquenne-Guigues base [8], of this context. For a given set of concept 
descriptions $\{C_1, \ldots, C_n\}$, we will define a formal context that has the property 
that its concept lattice is isomorphic to the subsumption hierarchy of all least 
common subsumers of subsets of $\{C_1, \ldots, C_n\}$. Thus, standard tools for drawing 
concept lattices [19] can be employed to show the hierarchy to the user. In 
addition, the dual Duquenne-Guigues base provides a small representation of 
this hierarchy. From this representation, all subsumption relationships can be 
deduced in time linear in the size of the representation. To compute the concept 
lattice and the Duquenne-Guigues base, the exploration algorithm employs an 
algorithm for computing the lcs and a decision procedure for subsumption as 
sub-procedures. Although, in the worst case, an exponential number of calls to 
these sub-procedures cannot be avoided, experience from applications of formal 
concept analysis [18] indicates that the exploration algorithm usually does a lot 
better in practice. 

Another application for this method is structuring of DL knowledge bases. 
DL knowledge bases encountered in applications are often rather broad in the 
sense that a given concept can have a large number of direct successors in the 
subsumption hierarchy. Narrower and thus deeper hierarchies would be more 
convenient when browsing the knowledge base along the hierarchy, and they 
would make searching more efficient. Given a concept C with direct sub-concepts 
$\{C_1, \ldots, C_n\}$, one could use least common subsumers of selected subsets to 
provide a better structuring of the knowledge base by inserting additional layers 
between C and its sub-concepts $C_1, \ldots, C_n$. Again, knowing the hierarchy of all 
least common subsumers of subsets of $\{C_1, \ldots, C_n\}$ can support the knowledge 
engineer in choosing the right subsets. 

In the next section, we introduce the DL languages $\mathcal{ALE}$ and $\mathcal{ALN}$, define 
the subsumption problem as well as the notion “least common subsumer”, and 
recall results from the literature for deciding subsumption and computing the 
least common subsumer. In Section 3, we introduce as many of the basic no- 
tions of formal concept analysis as are necessary for our purposes. In particular, 
we sketch the object exploration algorithm. Section 4 applies this technique to 
our problem of computing the subsumption hierarchy between least common 
subsumers of subsets of a given collection of concepts. In Section 5, we provide 
some experimental results that support our thesis that object exploration is an 
appropriate tool for this purpose. Section 6 concludes with some comments on 
related and future work. 



2 The Description Logics $\mathcal{ALE}$ and $\mathcal{ALN}$ 

For the purpose of this paper, it is sufficient to restrict the attention to the 
formalism for defining concept descriptions (i.e., we need not introduce TBoxes, 
which allow one to abbreviate complex descriptions by names, and ABoxes, which 
introduce objects and their properties). In order to define concepts in a DL 
knowledge base, one starts with a set $N_C$ of concept names (unary predicates) 
and a set $N_R$ of role names (binary predicates), and defines more complex concept 
descriptions using the operations provided by the DL language of the particular 
system. In this paper, we consider the languages $\mathcal{ALE}$ and $\mathcal{ALN}$,¹ which allow 
for concept descriptions built from the indicated subsets of the constructors 

¹ It should be noted, however, that the methods developed in this paper apply to 
arbitrary concept description languages, as long as they are equipped with a 
subsumption algorithm and an algorithm for computing least common subsumers. 
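Table 1. Syntax and semantics of concept descriptions. 

| name of constructor | Syntax | Semantics | $\mathcal{ALE}$ | $\mathcal{ALN}$ |
|---|---|---|---|---|
| top-concept | $\top$ | $\Delta^{\mathcal{I}}$ | x | x |
| bottom-concept | $\bot$ | $\emptyset$ | x | x |
| primitive negation | $\neg P$ | $\Delta^{\mathcal{I}} \setminus P^{\mathcal{I}}$ | x | x |
| conjunction | $C \sqcap D$ | $C^{\mathcal{I}} \cap D^{\mathcal{I}}$ | x | x |
| value restriction | $\forall r.C$ | $\{x \in \Delta^{\mathcal{I}} \mid \forall y\colon (x, y) \in r^{\mathcal{I}} \rightarrow y \in C^{\mathcal{I}}\}$ | x | x |
| existential restriction | $\exists r.C$ | $\{x \in \Delta^{\mathcal{I}} \mid \exists y\colon (x, y) \in r^{\mathcal{I}} \wedge y \in C^{\mathcal{I}}\}$ | x | |
| number restriction | $(\geq n\, r)$ | $\{x \in \Delta^{\mathcal{I}} \mid \#\{y \in \Delta^{\mathcal{I}} \mid (x, y) \in r^{\mathcal{I}}\} \geq n\}$ | | x |
| number restriction | $(\leq n\, r)$ | $\{x \in \Delta^{\mathcal{I}} \mid \#\{y \in \Delta^{\mathcal{I}} \mid (x, y) \in r^{\mathcal{I}}\} \leq n\}$ | | x |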



shown in Table 1. In this table, P stands for a concept name, r for a role name, 
n for a nonnegative integer, and C, D for arbitrary concept descriptions. 

The semantics of concept descriptions is defined in terms of an interpretation 
$\mathcal{I} = (\Delta^{\mathcal{I}}, \cdot^{\mathcal{I}})$. The domain $\Delta^{\mathcal{I}}$ of $\mathcal{I}$ is a non-empty set and the interpretation 
function $\cdot^{\mathcal{I}}$ maps each concept name $P \in N_C$ to a set $P^{\mathcal{I}} \subseteq \Delta^{\mathcal{I}}$ and each role 
name $r \in N_R$ to a binary relation $r^{\mathcal{I}} \subseteq \Delta^{\mathcal{I}} \times \Delta^{\mathcal{I}}$. The extension of $\cdot^{\mathcal{I}}$ to arbitrary 
concept descriptions is inductively defined, as shown in the third column of 
Table 1. 

One of the most important traditional inference services provided by DL 
systems is computing subconcept/superconcept relationships (so-called 
subsumption relationships) between concept descriptions. The concept description $C_2$ 
subsumes the concept description $C_1$ ($C_1 \sqsubseteq C_2$) iff $C_1^{\mathcal{I}} \subseteq C_2^{\mathcal{I}}$ for all 
interpretations $\mathcal{I}$; $C_2$ is equivalent to $C_1$ ($C_1 \equiv C_2$) iff $C_1 \sqsubseteq C_2$ and $C_2 \sqsubseteq C_1$. The 
subsumption relation $\sqsubseteq$ is a quasi order (i.e., reflexive and transitive), but in 
general not a partial order since it need not be antisymmetric (i.e., there may 
exist equivalent descriptions that are not syntactically equal). As usual, the quasi 
order $\sqsubseteq$ induces a partial order $\sqsubseteq_\equiv$ on the equivalence classes of concept 
descriptions: 

$[C_1]_\equiv \sqsubseteq_\equiv [C_2]_\equiv$ iff $C_1 \sqsubseteq C_2$, 

where $[C_i]_\equiv := \{D \mid C_i \equiv D\}$ is the equivalence class of $C_i$ ($i = 1, 2$). When 
talking about the subsumption hierarchy of a set of descriptions, one means this 
induced partial order. 

Deciding subsumption between $\mathcal{ALN}$-concept descriptions is polynomial [4], 
whereas the subsumption problem for $\mathcal{ALE}$ is NP-complete [6]. 

In addition to subsumption, we are interested in the non-standard inference 
problem of computing the least common subsumer of concept descriptions. 

Definition 1 Given $n \geq 2$ concept descriptions $C_1, \ldots, C_n$ in a DL language 
$\mathcal{L}$, the concept description C of $\mathcal{L}$ is an lcs of $C_1, \ldots, C_n$ ($C = \mathrm{lcs}(C_1, \ldots, C_n)$) 
iff (i) $C_i \sqsubseteq C$ for all $1 \leq i \leq n$, and (ii) C is the least concept description with 
this property, i.e., if C' satisfies $C_i \sqsubseteq C'$ for all $1 \leq i \leq n$, then $C \sqsubseteq C'$. 

As an example, consider the $\mathcal{ALE}$-concept descriptions 

$C := \exists\text{has-child}.\top \sqcap \forall\text{has-child}.(\text{Male} \sqcap \text{Doctor})$ and 
$D := \exists\text{has-child}.(\text{Male} \sqcap \text{Mechanic}) \sqcap \exists\text{has-child}.(\text{Female} \sqcap \text{Doctor})$ 

respectively describing parents whose children are all male doctors, and parents 
having a son that is a mechanic and a daughter that is a doctor. The lcs of C 
and D is given by the $\mathcal{ALE}$-concept description 

$\mathrm{lcs}(C, D) = \exists\text{has-child}.\text{Male} \sqcap \exists\text{has-child}.\text{Doctor}$. 

It describes all parents having at least one son and at least one child that is a 
doctor (see [3] for an algorithm for computing such an lcs). 
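
The example can be retraced with a small sketch. It is not the algorithm of [3]: it handles only conjunction and existential restriction, and it assumes that the value restriction of C has already been folded by hand into its existential restriction, so that C is represented as ∃has-child.(Male ⊓ Doctor). The tuple encoding and the helper `concept` are illustrative choices, not notation from the paper.

```python
from itertools import product

# A description is a pair (names, existentials):
#   names        : frozenset of concept names occurring as conjuncts
#   existentials : frozenset of (role, description) pairs, one per
#                  conjunct of the form "exists role.description"
def concept(names=(), existentials=()):
    return (frozenset(names), frozenset(existentials))

def lcs(c, d):
    """Binary lcs for descriptions built from conjunction and
    existential restriction only (assumption: no other constructors)."""
    c_names, c_ex = c
    d_names, d_ex = d
    # Concept names shared by both descriptions survive.
    common_names = c_names & d_names
    # Every pair of existential restrictions on the same role
    # contributes the lcs of the two restricted descriptions.
    common_ex = frozenset(
        (r, lcs(c_sub, d_sub))
        for (r, c_sub), (s, d_sub) in product(c_ex, d_ex)
        if r == s
    )
    return (common_names, common_ex)

# C, with its value restriction folded into the existential restriction by hand.
C = concept(existentials=[("has-child", concept(["Male", "Doctor"]))])
D = concept(existentials=[
    ("has-child", concept(["Male", "Mechanic"])),
    ("has-child", concept(["Female", "Doctor"])),
])

result = lcs(C, D)
expected = concept(existentials=[
    ("has-child", concept(["Male"])),
    ("has-child", concept(["Doctor"])),
])
print(result == expected)
```

The sketch does not remove redundant conjuncts, which is one reason why the result of such a product construction can grow quickly when it is iterated.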

Depending on the DL under consideration, the lcs of two or more descriptions 
need not always exist, but if it exists, then it is unique up to equivalence. It is 
also easy to see that one can restrict the attention to the problem of computing 
the lcs of two concept descriptions, since the lcs of n > 2 descriptions can be 
obtained by iterated application of the binary lcs operation. 

In [3], we have shown that the lcs of two $\mathcal{ALE}$-concept descriptions always 
exists and that it can effectively be computed; however, the size of the lcs can 
be exponential in the size of the input descriptions. For $\mathcal{ALN}$, the lcs of two 
descriptions also always exists and it can be computed in polynomial time. In 
addition, the size of the lcs is polynomial in the size of the input descriptions, 
even if one considers n > 2 descriptions (these results for $\mathcal{ALN}$ can easily be 
obtained by restricting the results in [2] to the acyclic case). 

3 Formal Concept Analysis 

We shall introduce only those notions and results from formal concept analysis 
that are necessary for our application. We will describe how the object explo- 
ration algorithm works, but note that explaining why it works is beyond the 
scope of this paper (see [11] for more information on formal concept analysis). 

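Definition 2 A formal context is a triple $\mathcal{K} = (\mathcal{O}, \mathcal{P}, \mathcal{S})$, where $\mathcal{O}$ is a set of 
objects, $\mathcal{P}$ is a set of attributes (or properties), and $\mathcal{S} \subseteq \mathcal{O} \times \mathcal{P}$ is a relation that 
connects each object o with the attributes satisfied by o. 

Let $\mathcal{K} = (\mathcal{O}, \mathcal{P}, \mathcal{S})$ be a formal context. For a set of objects $A \subseteq \mathcal{O}$, the intent 
A' of A is the set of attributes that are satisfied by all objects in A, i.e., 

$A' := \{p \in \mathcal{P} \mid \forall a \in A\colon (a, p) \in \mathcal{S}\}$. 

Similarly, for a set of attributes $B \subseteq \mathcal{P}$, the extent B' of B is the set of objects 
that satisfy all attributes in B, i.e., 

$B' := \{o \in \mathcal{O} \mid \forall b \in B\colon (o, b) \in \mathcal{S}\}$. 

It is easy to see that, for $A_1 \subseteq A_2 \subseteq \mathcal{O}$ (resp. $B_1 \subseteq B_2 \subseteq \mathcal{P}$), we have 

- $A_2' \subseteq A_1'$ (resp. $B_2' \subseteq B_1'$), 

- $A_1 \subseteq A_1''$ and $A_1' = A_1'''$ (resp. $B_1 \subseteq B_1''$ and $B_1' = B_1'''$). 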

A formal concept is a pair (A, B) consisting of an extent $A \subseteq \mathcal{O}$ and an intent 
$B \subseteq \mathcal{P}$ such that A' = B and B' = A. Such formal concepts can be hierarchically 
ordered by inclusion of their extents, and this order induces a complete lattice, 
the concept lattice of the context. Given a formal context, the first step for 
analyzing this context is usually to compute the concept lattice. 
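
The two derivation operators can be sketched for a small finite context as follows. The context (objects `o1` to `o3`, attributes `p1` and `p2`) is a made-up illustration, and the function names `intent` and `extent` are assumptions of this sketch.

```python
def intent(objs, attributes, incidence):
    """A': the attributes satisfied by every object in objs."""
    return {p for p in attributes if all((o, p) in incidence for o in objs)}

def extent(attrs, objects, incidence):
    """B': the objects that satisfy every attribute in attrs."""
    return {o for o in objects if all((o, p) in incidence for p in attrs)}

# Hypothetical toy context: o1 has p1, o2 has p1 and p2, o3 has p2.
objects = {"o1", "o2", "o3"}
attributes = {"p1", "p2"}
incidence = {("o1", "p1"), ("o2", "p1"), ("o2", "p2"), ("o3", "p2")}

A = {"o1"}
B = intent(A, attributes, incidence)          # A'
closure = extent(B, objects, incidence)       # A''

# (A'', A') is a formal concept: the intent of A'' is A' again.
print(sorted(closure), sorted(B))
print(intent(closure, attributes, incidence) == B)
```

Here A'' is strictly larger than A, which illustrates that only the closed sets of objects occur as extents.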

The following are easy consequences of the definition of formal concepts and 
the properties of the $\cdot'$ operation mentioned above: 

Lemma 3 All formal concepts are of the form (A'', A') for a subset A of $\mathcal{O}$, and 
any such pair is a formal concept. In addition, $(A_1'', A_1') \leq (A_2'', A_2')$ iff $A_2' \subseteq A_1'$. 

Thus, if the context is finite, the concept lattice can in principle be computed 
by enumerating the subsets A of $\mathcal{O}$, and applying the operations $\cdot'$ and $\cdot''$. 
However, this naive algorithm is usually very inefficient. In many applications 
[18], one has a large (or even infinite) set of objects, but only a relatively small 
set of attributes. In such a situation, Ganter’s attribute exploration algorithm 
[10,11] has turned out to be an efficient approach for computing the concept 
lattice. 
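
For a finite context, the naive enumeration mentioned above can be sketched in a few lines. The context is a made-up example and the function name `all_concepts` is an assumption of this sketch; the loop visits all subsets of the object set, which is exactly why the approach does not scale.

```python
from itertools import combinations

def all_concepts(objects, attributes, incidence):
    """Enumerate every subset A of the objects and collect (A'', A')."""
    def intent(objs):
        return frozenset(p for p in attributes
                         if all((o, p) in incidence for o in objs))

    def extent(attrs):
        return frozenset(o for o in objects
                         if all((o, p) in incidence for p in attrs))

    concepts = set()
    ordered = sorted(objects)
    for size in range(len(ordered) + 1):
        for subset in combinations(ordered, size):
            b = intent(subset)      # A'
            a = extent(b)           # A''
            concepts.add((a, b))    # duplicates collapse in the set
    return concepts

# Hypothetical toy context: o1 has p1, o2 has p1 and p2, o3 has p2.
objects = {"o1", "o2", "o3"}
attributes = {"p1", "p2"}
incidence = {("o1", "p1"), ("o2", "p1"), ("o2", "p2"), ("o3", "p2")}

concepts = all_concepts(objects, attributes, incidence)
print(len(concepts))
```

Eight subsets are inspected here to find four formal concepts; with n objects the loop always runs 2^n times, independently of how many concepts exist.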

In the application considered in this paper, we are faced with the dual sit- 
uation: the set of attributes will be the infinite set of all possible concept de- 
scriptions of the DL under consideration, and the set of objects will be the finite 
collection of concept descriptions for which we want to compute the subsumption 
hierarchy of least common subsumers. Consequently, we must dualize Ganter’s 
algorithm and the notions on which it depends. Alternatively, we could have 
considered the dual context (which is obtained by transposing the matrix 
corresponding to $\mathcal{S}$) and employed the usual attribute exploration for this context. 
The correctness of the dual version is an immediate consequence of the fact that 
its results coincide with the results of the usual algorithm on the dual context. 



Object Exploration 

Before we can describe the dual version of Ganter’s algorithm, called object 
exploration in the following, we must introduce some notation. The most im- 
portant notion for the algorithm is the one of an implication between sets of 
objects. Intuitively, such an implication $A_1 \to A_2$ holds if any attribute satisfied 
by all elements of $A_1$ is also satisfied by all elements of $A_2$. 

Definition 4 Let $\mathcal{K}$ be a formal context and $A_1, A_2$ be subsets of $\mathcal{O}$. The object 
implication $A_1 \to A_2$ holds in $\mathcal{K}$ ($\mathcal{K} \models A_1 \to A_2$) iff $A_1' \subseteq A_2'$. An attribute p 
violates the implication $A_1 \to A_2$ iff $p \in A_1' \setminus A_2'$. 

It is easy to see that an implication $A_1 \to A_2$ holds in $\mathcal{K}$ iff $A_2 \subseteq A_1''$. In 
particular, given a set of objects A, the implication $A \to (A'' \setminus A)$ always holds 
in $\mathcal{K}$. We denote the set of all object implications that hold in $\mathcal{K}$ by $\mathit{Imp}_{\mathcal{O}}(\mathcal{K})$. 
This set can be very large, and thus one is interested in (small) generating sets. 

Definition 5 Let $\mathcal{J}$ be a set of object implications, i.e., the elements of $\mathcal{J}$ are 
of the form $A_1 \to A_2$ for sets of objects $A_1, A_2 \subseteq \mathcal{O}$. For a subset A of $\mathcal{O}$, the 
implication hull of A with respect to $\mathcal{J}$ is denoted by $\mathcal{J}(A)$. It is the smallest 
subset H of $\mathcal{O}$ such that 

- $A \subseteq H$, and 

- $A_1 \to A_2 \in \mathcal{J}$ and $A_1 \subseteq H$ imply $A_2 \subseteq H$. 

The set of object implications generated by $\mathcal{J}$ consists of all implications $A_1 \to A_2$ 
such that $A_2 \subseteq \mathcal{J}(A_1)$. It will be denoted by $\mathit{Cons}(\mathcal{J})$. We say that a set 
of implications $\mathcal{J}$ is a base of $\mathit{Imp}_{\mathcal{O}}(\mathcal{K})$ iff $\mathit{Cons}(\mathcal{J}) = \mathit{Imp}_{\mathcal{O}}(\mathcal{K})$ and no proper 
subset of $\mathcal{J}$ satisfies this property. 

If $\mathcal{J}$ is a base for $\mathit{Imp}_{\mathcal{O}}(\mathcal{K})$, then it can be shown that $A'' = \mathcal{J}(A)$ for all $A \subseteq \mathcal{O}$. 
The implication hull $\mathcal{J}(A)$ of a set of objects A can be computed in time linear 
in the size of $\mathcal{J}$ and A using, for example, methods for deciding satisfiability of 
sets of propositional Horn clauses [7]. Consequently, given a base $\mathcal{J}$ for $\mathit{Imp}_{\mathcal{O}}(\mathcal{K})$, 
any question of the form “$A_1 \to A_2 \in \mathit{Imp}_{\mathcal{O}}(\mathcal{K})$?” can be answered in time linear 
in the size of $\mathcal{J} \cup \{A_1 \to A_2\}$. 
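
The implication hull and the resulting entailment test can be sketched as a simple fixpoint computation. This is a naive version that may rescan the implications several times, not the linear-time method referred to above; the function names and the sample implications are assumptions of this sketch.

```python
def implication_hull(start, implications):
    """Smallest superset of `start` closed under all implications.

    `implications` is an iterable of (premise, conclusion) pairs of sets.
    Naive fixpoint iteration: repeat until nothing changes.
    """
    hull = set(start)
    changed = True
    while changed:
        changed = False
        for premise, conclusion in implications:
            if premise <= hull and not conclusion <= hull:
                hull |= conclusion
                changed = True
    return hull

def follows(premise, conclusion, implications):
    """Does premise -> conclusion belong to the generated set Cons(J)?"""
    return set(conclusion) <= implication_hull(premise, implications)

# Hypothetical implication set over objects o1, ..., o4.
J = [
    (frozenset({"o1"}), frozenset({"o2"})),
    (frozenset({"o2", "o3"}), frozenset({"o4"})),
]

print(sorted(implication_hull({"o1", "o3"}, J)))
print(follows({"o1"}, {"o4"}, J))
```

The first call needs both implications in sequence: `o1` yields `o2`, and only then does the premise `{o2, o3}` fire.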

There may exist different implication bases of $\mathit{Imp}_{\mathcal{O}}(\mathcal{K})$, and not all of them 
need to be of minimal cardinality. A base $\mathcal{J}$ of $\mathit{Imp}_{\mathcal{O}}(\mathcal{K})$ is called a minimal base 
iff no base of $\mathit{Imp}_{\mathcal{O}}(\mathcal{K})$ has a cardinality smaller than the cardinality of $\mathcal{J}$. 
Duquenne and Guigues have given a description of such a minimal base [8] for 
the dual case of attribute implications. Ganter’s attribute exploration algorithm 
computes this minimal base as a by-product. In the following, we introduce the 
dual Duquenne-Guigues base and show how it can be computed using the object 
exploration algorithm. 

The definition of the dual Duquenne-Guigues base given below is based on 
a modification of the closure operator $A \mapsto \mathcal{J}(A)$ defined by a set $\mathcal{J}$ of object 
implications. For a subset A of $\mathcal{O}$, the implication pseudo-hull of A with respect 
to $\mathcal{J}$ is denoted by $\mathcal{J}^*(A)$. It is the smallest subset H of $\mathcal{O}$ such that 

- $A \subseteq H$, and 

- $A_1 \to A_2 \in \mathcal{J}$ and $A_1 \subset H$ (strict subset) imply $A_2 \subseteq H$. 

Given $\mathcal{J}$, the pseudo-hull of a set $A \subseteq \mathcal{O}$ can again be computed in time linear 
in the size of $\mathcal{J}$ and A (e.g., by adapting the algorithm in [7] appropriately). A 
subset A of $\mathcal{O}$ is called pseudo-closed in a formal context $\mathcal{K}$ iff $\mathit{Imp}_{\mathcal{O}}(\mathcal{K})^*(A) = A$ 
and $A'' \neq A$. 

Definition 6 The dual Duquenne-Guigues base of a formal context $\mathcal{K}$ consists 
of all object implications $A_1 \to A_2$ where $A_1 \subseteq \mathcal{O}$ is pseudo-closed in $\mathcal{K}$ and 
$A_2 = A_1'' \setminus A_1$. 

When trying to use this definition for actually computing the dual Duquenne- 
Guigues base of a formal context, one encounters two problems: 

1. The definition of pseudo-closed refers to the set of all valid implications 
$\mathit{Imp}_{\mathcal{O}}(\mathcal{K})$, and our goal is to avoid explicitly computing all of them. 
2. The closure operator $A \mapsto A''$ is used, and computing it via $A \mapsto A' \mapsto A''$ 
may not be feasible for a context with an infinite set of attributes. 

Ganter solves the first problem by enumerating the pseudo-closed sets of $\mathcal{K}$ in 
a particular order, called lectic order. This order makes sure that it is sufficient 
to use the already computed part $\mathcal{J}$ of the base when computing the 
pseudo-hull. To define the lectic order, fix an arbitrary linear order on the set of objects 
$\mathcal{O} = \{o_1, \ldots, o_n\}$, say $o_1 < \cdots < o_n$. For all j, $1 \leq j \leq n$, and $A_1, A_2 \subseteq \mathcal{O}$ we 
define 

$A_1 <_j A_2$ iff $o_j \in A_2 \setminus A_1$ and $A_1 \cap \{o_1, \ldots, o_{j-1}\} = A_2 \cap \{o_1, \ldots, o_{j-1}\}$. 

The lectic order < is the union of all relations $<_j$ for $j = 1, \ldots, n$. It is a linear 
order on the powerset of $\mathcal{O}$. The lectic smallest subset of $\mathcal{O}$ is the empty set. 
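
The definition of the lectic order translates directly into a comparison function. The sketch below uses 1-based indices j as in the text; the names `lectic_less_at` and `lectic_less` and the three sample objects are assumptions of this illustration.

```python
def lectic_less_at(a1, a2, order, j):
    """A1 <_j A2: o_j is in A2 but not in A1, and both sets agree
    on the objects o_1, ..., o_{j-1} (j is 1-based as in the text)."""
    o_j = order[j - 1]
    prefix = set(order[: j - 1])
    return o_j in a2 and o_j not in a1 and (a1 & prefix) == (a2 & prefix)

def lectic_less(a1, a2, order):
    """A1 < A2 in the lectic order: A1 <_j A2 for some j."""
    return any(lectic_less_at(a1, a2, order, j)
               for j in range(1, len(order) + 1))

order = ["o1", "o2", "o3"]  # fixed linear order o1 < o2 < o3

print(lectic_less(set(), {"o3"}, order))
print(lectic_less({"o3"}, {"o2"}, order))
print(lectic_less({"o2"}, {"o3"}, order))
```

The second and third calls show that the smallest object in which two sets differ decides the comparison: the set containing it is the lectically larger one.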

The second problem is solved by constructing an increasing chain of finite 
subcontexts of $\mathcal{K}$. The context $\mathcal{K}_i = (\mathcal{O}_i, \mathcal{P}_i, \mathcal{S}_i)$ is a subcontext of $\mathcal{K}$ iff $\mathcal{O}_i = \mathcal{O}$, 
$\mathcal{P}_i \subseteq \mathcal{P}$, and $\mathcal{S}_i = \mathcal{S} \cap (\mathcal{O} \times \mathcal{P}_i)$. The closure operator $A \mapsto A''$ is always 
computed with respect to the current finite subcontext $\mathcal{K}_i$. To avoid adding a 
wrong implication, an “expert” is asked whether the implication $A \to A'' \setminus A$ 
really holds in the whole $\mathcal{K}$. If it does not hold, the expert must provide a 
counterexample, i.e., an attribute p from $\mathcal{P} \setminus \mathcal{P}_i$ that violates the implication. 
This attribute is then added to the current context. Technically, this means that 
the expert must provide an attribute p, and must say which of the objects of $\mathcal{O}$ 
satisfy this attribute and which don’t. 

The following algorithm computes the set of all extents of formal concepts of 
$\mathcal{K}$ as well as the dual Duquenne-Guigues base of $\mathcal{K}$. The concept lattice is then 
given by the usual inclusion ordering between the extents. 



Algorithm 7 (Object exploration) Initialization: One starts with the empty 
set of object implications, i.e., $\mathcal{J}_0 := \emptyset$, the empty set of concept extents $\mathcal{E}_0 := \emptyset$, 
and the empty subcontext $\mathcal{K}_0$ of $\mathcal{K}$, i.e., $\mathcal{P}_0 := \emptyset$. The lectic smallest subset of $\mathcal{O}$ 
is $A_0 := \emptyset$. 

Iteration: Assume that $\mathcal{K}_i$, $\mathcal{J}_i$, $\mathcal{E}_i$, and $A_i$ ($i \geq 0$) are already computed. 
Compute $A_i''$ with respect to the current subcontext $\mathcal{K}_i$. Now the expert is asked 
whether the implication $A_i \to A_i'' \setminus A_i$ holds in $\mathcal{K}$.² 

If the answer is “no”, then let $p_i \in \mathcal{P}$ be the counterexample provided by the 
expert. Let $A_{i+1} := A_i$, $\mathcal{J}_{i+1} := \mathcal{J}_i$, and let $\mathcal{K}_{i+1}$ be the subcontext of $\mathcal{K}$ with 

$\mathcal{P}_{i+1} := \mathcal{P}_i \cup \{p_i\}$. 

If the answer is “yes”, then $\mathcal{K}_{i+1} := \mathcal{K}_i$ and 

$$(\mathcal{E}_{i+1}, \mathcal{J}_{i+1}) := \begin{cases} (\mathcal{E}_i,\ \mathcal{J}_i \cup \{A_i \to A_i'' \setminus A_i\}) & \text{if } A_i'' \neq A_i, \\ (\mathcal{E}_i \cup \{A_i\},\ \mathcal{J}_i) & \text{if } A_i'' = A_i. \end{cases}$$

To find the new set $A_{i+1}$, we start with j = n, and test whether 

(*) $A_i <_j \mathcal{J}_{i+1}^*((A_i \cap \{o_1, \ldots, o_{j-1}\}) \cup \{o_j\})$ 

holds. The index j is decreased until one of the following cases occurs: 

(1) j = 0: In this case, $\mathcal{E}_{i+1}$ is the set of all concept extents and $\mathcal{J}_{i+1}$ the dual 
Duquenne-Guigues base of $\mathcal{K}$, and the algorithm stops. 

(2) (*) holds for j > 0: In this case, $A_{i+1} := \mathcal{J}_{i+1}^*((A_i \cap \{o_1, \ldots, o_{j-1}\}) \cup \{o_j\})$, 
and the iteration is continued. 

² If $A_i'' \setminus A_i = \emptyset$, then it is not really necessary to ask the expert because implications 
with empty right-hand side hold in any context. 
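
The search for the next set via the test (*) follows the pattern of enumerating closed sets in lectic order. The sketch below shows only that pattern, under simplifying assumptions: the context is finite and fully known, no expert is consulted, and the closure operator passed in is the plain operator A ↦ A'' rather than the pseudo-hull, so it yields the concept extents but not the implication base. All names and the toy context are illustrative.

```python
def next_closed_set(current, order, closure):
    """Lectically next closed set after `current`, or None if there is none."""
    kept = set(current)
    for j in reversed(range(len(order))):       # j runs from n-1 down to 0
        o_j = order[j]
        if o_j in kept:
            kept.discard(o_j)                   # keep only objects before o_j
        else:
            candidate = closure(kept | {o_j})
            # Test (*): the candidate may add o_j and later objects only.
            if all(order.index(x) >= j for x in candidate - kept):
                return candidate
    return None

def all_closed_sets(order, closure):
    """All closed sets, listed in lectic order."""
    result = []
    current = closure(set())
    while current is not None:
        result.append(frozenset(current))
        current = next_closed_set(current, order, closure)
    return result

# Hypothetical toy context: o1 has p1, o2 has p1 and p2, o3 has p2.
objects = ["o1", "o2", "o3"]
attributes = ["p1", "p2"]
incidence = {("o1", "p1"), ("o2", "p1"), ("o2", "p2"), ("o3", "p2")}

def double_prime(objs):
    """A -> A'' in the toy context."""
    common = [p for p in attributes if all((o, p) in incidence for o in objs)]
    return {o for o in objects if all((o, p) in incidence for p in common)}

extents = all_closed_sets(objects, double_prime)
for e in extents:
    print(sorted(e))
```

Each call to `next_closed_set` performs at most n closure computations, so the total work is governed by the number of closed sets rather than by the number of all subsets.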

4 Computing the Hierarchy of Least Common Subsumers 



Given a finite set $\mathcal{O} := \{C_1, \ldots, C_n\}$ of concept descriptions, we are interested 
in the subsumption hierarchy between all least common subsumers of subsets 
of $\mathcal{O}$. For sets $A \subseteq \mathcal{O}$ of cardinality $\geq 2$, we have already defined the notion 
lcs(A). We extend this notion to the empty set and singletons in the obvious 
way: $\mathrm{lcs}(\emptyset) := \bot$ and $\mathrm{lcs}(\{C_i\}) := C_i$. 

Our goal is to compute the subsumption hierarchy between all concept 
descriptions lcs(A) for subsets A of $\mathcal{O}$ without explicitly computing all these least 
common subsumers. This is achieved by defining a formal context with objects 
$\mathcal{O}$ such that the concept lattice of this context is isomorphic to the subsumption 
hierarchy we are interested in. 

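Definition 8 Given a DL language $\mathcal{L}$ and a finite set $\mathcal{O} := \{C_1, \ldots, C_n\}$ of 
$\mathcal{L}$-concept descriptions, the corresponding formal context $\mathcal{K}_{\mathcal{L}}(\mathcal{O}) = (\mathcal{O}, \mathcal{P}, \mathcal{S})$ is 
defined as follows: 

$\mathcal{O} := \{C_1, \ldots, C_n\}$, 

$\mathcal{P} := \{D \mid D \text{ is an } \mathcal{L}\text{-concept description}\}$, 

$\mathcal{S} := \{(C, D) \mid C \sqsubseteq D\}$. 

As an easy consequence of the definition of $\mathcal{K}_{\mathcal{L}}(\mathcal{O})$ and of the lcs, we obtain that 
the intent of a set $A \subseteq \mathcal{O}$ is closely related to the lcs of this set: 

Lemma 9 Let $A_1, A_2$ be subsets of $\mathcal{O}$. 

1. $A_i' = \{D \in \mathcal{P} \mid \mathrm{lcs}(A_i) \sqsubseteq D\}$ for $i = 1, 2$; 

2. $A_1' \subseteq A_2'$ iff $\mathrm{lcs}(A_2) \sqsubseteq \mathrm{lcs}(A_1)$; 

3. the implication $A_1 \to A_2$ holds in $\mathcal{K}_{\mathcal{L}}(\mathcal{O})$ iff $\mathrm{lcs}(A_2) \sqsubseteq \mathrm{lcs}(A_1)$. 

As an immediate consequence of 3. of this lemma, the dual Duquenne-Guigues 
base $\mathcal{J}$ of $\mathcal{K}_{\mathcal{L}}(\mathcal{O})$ yields a representation of all subsumption relationships of the 
form $\mathrm{lcs}(A_1) \sqsubseteq \mathrm{lcs}(A_2)$ for subsets $A_1, A_2$ of $\mathcal{O}$. Given this base $\mathcal{J}$, any question 
of the form “$\mathrm{lcs}(A_1) \sqsubseteq \mathrm{lcs}(A_2)$?” can then be answered in time linear in the size 
of $\mathcal{J} \cup \{A_1 \to A_2\}$. Another easy consequence of the lemma is that the concept 
lattice of $\mathcal{K}_{\mathcal{L}}(\mathcal{O})$ coincides with the subsumption hierarchy of all least common 
subsumers of subsets of $\mathcal{O}$. 

Theorem 10 The concept lattice of $\mathcal{K}_{\mathcal{L}}(\mathcal{O})$ is isomorphic to the subsumption 
hierarchy of all least common subsumers of subsets of $\mathcal{O}$. 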

Proof. We define the mapping $\pi$ from the formal concepts of $\mathcal{K}_{\mathcal{L}}(\mathcal{O})$ to the set 
of (equivalence classes of) least common subsumers of subsets of $\mathcal{O}$ as follows: 

$\pi(A, B) := [\mathrm{lcs}(A)]_\equiv$. 

For formal concepts $(A_1, B_1)$, $(A_2, B_2)$ we have $(A_1, B_1) \leq (A_2, B_2)$ iff $A_1 \subseteq A_2$ 
iff $A_2' \subseteq A_1'$ iff $\mathrm{lcs}(A_1) \sqsubseteq \mathrm{lcs}(A_2)$. As an easy consequence we obtain that $\pi$ is 
order preserving (and thus also injective): $(A_1, B_1) \leq (A_2, B_2)$ iff $[\mathrm{lcs}(A_1)]_\equiv \sqsubseteq_\equiv 
[\mathrm{lcs}(A_2)]_\equiv$. 

It remains to be shown that $\pi$ is surjective as well. Let A be an arbitrary 
subset of $\mathcal{O}$. We must show that $[\mathrm{lcs}(A)]_\equiv$ can be obtained as an image of the 
mapping $\pi$. By Lemma 3, (A'', A') is a formal concept, and thus it is sufficient 
to show that $\mathrm{lcs}(A) \equiv \mathrm{lcs}(A'')$. Obviously, $A \subseteq A''$ implies $\mathrm{lcs}(A) \sqsubseteq \mathrm{lcs}(A'')$ 
(by definition of the lcs). To see the converse, note that, for all $C_i \in \mathcal{O}$, we have 

$C_i \in A''$ iff $C_i \sqsubseteq D$ for all $D \in A'$ (def. of $\cdot'$ and $\mathcal{K}_{\mathcal{L}}(\mathcal{O})$) 

iff $C_i \sqsubseteq D$ for all D such that $\mathrm{lcs}(A) \sqsubseteq D$ (Lemma 9) 

iff $C_i \sqsubseteq \mathrm{lcs}(A)$. (def. of the lcs) 

Obviously, this implies $\mathrm{lcs}(A'') \sqsubseteq \mathrm{lcs}(A)$. □ 

If we want to apply Algorithm 7 to compute the concept lattice and the 
dual Duquenne-Guigues base, we need an “expert” for the context $\mathcal{K}_{\mathcal{L}}(\mathcal{O})$. This 
expert must be able to answer the questions asked by the object exploration 
algorithm, i.e., given an object implication $A_1 \to A_2$, it must be able to decide 
whether this implication holds in $\mathcal{K}_{\mathcal{L}}(\mathcal{O})$. If the implication does not hold, it 
must be able to compute a counterexample, i.e., an attribute $p \in A_1' \setminus A_2'$. 

If the language $\mathcal{L}$ is such that the lcs is computable and subsumption is 
decidable (which is, e.g., the case for $\mathcal{L} = \mathcal{ALE}$ or $\mathcal{L} = \mathcal{ALN}$), then we can 
implement such an expert. 

Lemma 11 Given a subsumption algorithm for $\mathcal{L}$ as well as an algorithm for 
computing the lcs of a finite set of $\mathcal{L}$-concept descriptions, these algorithms can 
be used to obtain an “expert” for the context $\mathcal{K}_{\mathcal{L}}(\mathcal{O})$. 

Proof. First, we show how to decide whether a given object implication $A_1 \to A_2$ 
holds in $\mathcal{K}_{\mathcal{L}}(\mathcal{O})$ or not. By Lemma 9, we know that $A_1 \to A_2$ holds in $\mathcal{K}_{\mathcal{L}}(\mathcal{O})$ iff 
$\mathrm{lcs}(A_2) \sqsubseteq \mathrm{lcs}(A_1)$. Obviously, $\mathrm{lcs}(A_2) \sqsubseteq \mathrm{lcs}(A_1)$ iff $C_i \sqsubseteq \mathrm{lcs}(A_1)$ for all $C_i \in A_2$. 
Thus, to answer the question “$A_1 \to A_2$?”, we first compute $\mathrm{lcs}(A_1)$ and then use 
the subsumption algorithm to test whether $C_i \sqsubseteq \mathrm{lcs}(A_1)$ holds for all $C_i \in A_2$. 

Second, assume that $A_1 \to A_2$ does not hold in $\mathcal{K}_{\mathcal{L}}(\mathcal{O})$, i.e., $\mathrm{lcs}(A_2) \not\sqsubseteq \mathrm{lcs}(A_1)$. 
We claim that $\mathrm{lcs}(A_1)$ is a counterexample, i.e., $\mathrm{lcs}(A_1) \in A_1'$ and $\mathrm{lcs}(A_1) \notin A_2'$. 
This is an immediate consequence of the facts that $A_i' = \{D \in \mathcal{P} \mid \mathrm{lcs}(A_i) \sqsubseteq D\}$ 
($i = 1, 2$) and that $\mathrm{lcs}(A_2) \not\sqsubseteq \mathrm{lcs}(A_1)$. 

Of this counterexample, Algorithm 7 really needs the column corresponding 
to this attribute in the matrix corresponding to $\mathcal{S}$. This column can easily 
be computed using the subsumption algorithm: for each $C_i \in \mathcal{O}$, we use the 
subsumption algorithm to test whether $C_i \sqsubseteq \mathrm{lcs}(A_1)$ holds or not. □ 
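
The construction in this proof can be sketched for a deliberately simple stand-in language in which a concept description is just a conjunction of concept names. In that toy setting (an assumption of this sketch, far weaker than the languages considered in the paper) subsumption is a superset test on name sets and the lcs is their intersection. The object names and the return format of `expert` are likewise illustrative.

```python
BOTTOM = None  # stands for the bottom concept, i.e. the lcs of the empty set

def subsumed_by(c, d):
    """c is subsumed by d, for conjunctions of names given as frozensets."""
    if c is BOTTOM:
        return True            # bottom is subsumed by everything
    if d is BOTTOM:
        return False           # no satisfiable conjunction is below bottom
    return d <= c              # every conjunct of d also occurs in c

def lcs(descriptions):
    """n-ary lcs in the toy language: the conjuncts common to all inputs."""
    descriptions = list(descriptions)
    if not descriptions:
        return BOTTOM
    common = descriptions[0]
    for d in descriptions[1:]:
        common = common & d
    return common

def expert(a1, a2, objects):
    """Decide whether A1 -> A2 holds. If not, return the counterexample
    lcs(A1) together with its column: the objects it subsumes."""
    generalization = lcs(a1)
    if all(subsumed_by(c, generalization) for c in a2):
        return True, None
    column = {c for c in objects if subsumed_by(c, generalization)}
    return False, (generalization, column)

# Hypothetical objects of the context.
C1 = frozenset({"Reactor", "Stirred"})
C2 = frozenset({"Reactor", "Tubular"})
C3 = frozenset({"Vessel"})
objects = {C1, C2, C3}

print(expert({C1}, {C2}, objects))
print(expert({C1, C2}, {C1}, objects))
```

In the first call the implication fails, and the returned column contains only `C1`, so the new attribute separates `C1` from `C2` in the current subcontext.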

Using this expert, an application of Algorithm 7 yields 

- all extents of formal concepts of $\mathcal{K}_{\mathcal{L}}(\mathcal{O})$, and thus the concept lattice of 
$\mathcal{K}_{\mathcal{L}}(\mathcal{O})$, which coincides with the subsumption hierarchy of all least common 
subsumers of subsets of $\mathcal{O}$ (by Theorem 10); 

- the dual Duquenne-Guigues base of $\mathcal{K}_{\mathcal{L}}(\mathcal{O})$, which yields a compact 
representation of this hierarchy (by 3. of Lemma 9); and 

- a finite subcontext of $\mathcal{K}_{\mathcal{L}}(\mathcal{O})$ that has the same concept extents as $\mathcal{K}_{\mathcal{L}}(\mathcal{O})$ 
and the same $\cdot''$ operation on sets of objects. 

Using the output of Algorithm 7, one can then employ the usual tools for drawing 
concept lattices [19] in order to present the subsumption hierarchy of all least 
common subsumers of subsets of O to the knowledge engineer. 

5 First Experimental Results 

In the previous section, we have shown that the object exploration algorithm 
can be used to compute the hierarchy of least common subsumers of a given set 
of concept descriptions. What remains is to analyze whether object exploration 
really is a good approach for solving this task. Our reason for trying it in the 
first place was that computing this hierarchy is the same as computing a certain 
concept lattice (as shown above), and that Ganter’s algorithm is known to be 
a very good method for doing this. The problem with this generic argument in 
favour of object exploration is, of course, that we consider a very specific context, 
and that it might well be that, for this context, object exploration is not the 
best thing to do. The first experimental results that will be described below are, 
however, rather encouraging. 

We intend to use the bottom-up construction of knowledge bases in a chemical 
process engineering application [13,15], where the knowledge base describes 
standard building blocks of process models (such as certain types of reactors). 
Currently, this knowledge base consists of about 600 definitions of building blocks, 
which we translate into $\mathcal{ALE}$-concept descriptions. 

In order to test the object exploration algorithm, we have taken 7 descriptions 
of reactors of a similar type, which the process engineers considered to be good 
examples for generating a new concept. These descriptions were translated into 
$\mathcal{ALE}$-concept descriptions $R_1, \ldots, R_7$, and we applied the object exploration 
algorithm to this set of objects. The resulting concept lattice, which coincides with 
the hierarchy of all least common subsumers of subsets of $\mathcal{O} := \{R_1, \ldots, R_7\}$, 
is depicted in Figure 1. The top concept corresponds to the lcs obtained from 
the whole set of examples, and the bottom concept is the lcs obtained from the 
empty set, i.e., the description $\bot$. The node labelled $\mathrm{lcs}(i_1 \ldots i_m)$ corresponds to 
the formal concept with extent $\{R_{i_1}, \ldots, R_{i_m}\}$ and thus to $\mathrm{lcs}(R_{i_1}, \ldots, R_{i_m})$. Note 
that in many cases $\mathrm{lcs}(R_{i_1}, \ldots, R_{i_m})$ can also be obtained as the lcs of a strict 
subset of $\{R_{i_1}, \ldots, R_{i_m}\}$. This can be easily seen by using the least-upper-bound 
operation of the concept lattice. For example, $\mathrm{lcs}(R_i, R_7) \equiv \mathrm{lcs}(R_1, \ldots, R_7)$ for 
all i, $1 \leq i \leq 6$. 




Building and Structuring Description Logic Knowledge Bases 303 




Fig. 1. The hierarchy of least common subsumers of seven reactor descriptions. 



Statistical information: The Duquenne-Guigues base of the context consists of
15 implications, and the concept lattice of 30 formal concepts. If we subtract
the trivial least common subsumers $\bot$, $R_1, \dots, R_7$, as well as
$\mathrm{lcs}(R_1, \dots, R_7)$, which turned out to be equivalent to an already
existing description, we end up with 21 candidates for new concepts. Of these 21
interesting least common subsumers, only 10 have explicitly been computed during
the exploration.

During the calls of the “expert”, 255 subsumption tests and 25 n-ary lcs
operations have been executed. Because we re-used already computed least common
subsumers, the 25 n-ary lcs operations only required 25 binary lcs operations.
The number of counterexamples computed by the expert was also 25.

Finally, we measured the time needed for executing the interesting subtasks,
namely computing the lcs, testing subsumption, and realizing the overhead
introduced by the object exploration algorithm (e.g., computing the $\cdot''$
operation, the pseudo-hull, etc.). It turned out that more than 84% of the time
was used for computing least common subsumers, 15% for subsumption tests, and
less than 1% for the rest. This shows that, at least for this small example, the
exploration algorithm does not introduce any measurable overhead. The fact that
computing the lcs needed a lot more time than testing subsumption is probably
due to the fact that we used a highly optimized subsumption algorithm [12], but
only a first prototypical implementation of the lcs algorithm.

What can be learned from the concept lattice? Two important facts about the
reactor descriptions can be read off immediately. First, there is no subsumption
relationship between any of the 7 concepts since all singleton sets occur as
extents. Second, Reactor 7 is quite different from the other reactors since its
lcs with any of the others yields a very general concept description. Thus, it
should not be used for generating new concepts together with the other ones. In
fact, a closer look at $R_7$ revealed that, though it describes a reactor of a
type similar to that of the other ones, this description was given on a
completely different level of abstraction.

Next, let us consider the question of which of the least common subsumers 
occurring in the lattice appear to be good candidates for providing an interesting 
new concept. First, the lcs of the whole set is ruled out since it involves Reactor 7,
which does not fit well with the other examples (see above). Second, in order 
to avoid concepts that are too specific, least common subsumers that do not 
cover more than half of the reactors should also be avoided. If we use these two 
criteria, then we are left with 9 candidates (the formal concepts with extents of 
cardinality 4, 5, and 6), which is a number of concepts that can well be inspected 
by the process engineer. In our example, the 5 least common subsumers on the 
first layer of these interesting candidates (the formal concepts with extents of 
cardinality 4) turned out to be the most promising, though this must still be 
checked in more detail with the process engineers. 

6 Related and Future Work 

The idea of using tools from formal concept analysis in description logics is
not new. In [1], the attribute exploration algorithm was used to compute a small
representation of the subsumption hierarchy of all conjunctions of concepts
defined in a terminology, and in [17] this approach was extended such that it
could handle both conjunction and disjunction. There are, however, significant
differences to the approach considered in the present paper. First, the formal
context defined in [1,17] is quite different from the one introduced above: its
objects are pairs consisting of an interpretation $\mathcal{I}$ and an element
of its domain $\Delta^{\mathcal{I}}$. To be able to compute the counterexamples
required by the attribute exploration algorithm, the subsumption algorithm had
to be extended such that it computes appropriate finite countermodels [1].
Second, [1] was only interested in the Duquenne-Guigues base computed by the
algorithm, and not in the concept lattice. In fact, in the experiments made with
the approach introduced in [1], the base turned out to be usually rather small,
whereas the lattice was very large (and thus it did not make sense to visualize
it) [14].

In the future we will test the approach introduced in this paper with more
examples from our chemical process engineering application. In particular, we
will more closely analyze which of the least common subsumers yield concepts
that make sense in the application domain, with the goal of developing
appropriate heuristics for suggesting good candidates based on the computed
concept lattice. In addition, we will try to optimize the method, with the goal
of avoiding even more explicit lcs computations by using information from the
partial implication base
and the subcontext already computed.

Abstract. This paper reports on first attempts to develop a contextual
logic of ordinal data. The investigations are based on a mathematical
theory of ordinal contexts which has been developed within Formal Concept
Analysis. From ordinal contexts, binary power context families are
derived as semantic basis of a contextual logic of ordinal data. They are
used to characterize compound attributes extensionally. In this way, the
contextual logic becomes a relational logic within the framework of the
Peircean Algebraic Logic, as reconstructed by R. W. Burch. The
considerations of this paper are discussed through an example of ordinal
data investigated toward a meaningful representation in ordered vector
spaces.

1 Ordinal Structures

The aim of this paper is to show how methods of Contextual Logic (cf. [Wi97],
[Pr98], [GW99b]) may be introduced into Ordinal Data Analysis to support
meaningful applications of mathematical investigations of ordinal structures.
This shall be understood as a contribution to the development of a basic
mathematical theory of ordinal data which is necessary for obtaining relevant
interpretations of ordinal data.

According to [Co64], there are two main types of relations between objects
in empirical data: dominance (order) and proximity (similarity). With respect
to this classification, ordinal data may be understood as representations of
dominances. Although ordinal data are basic and occur in high frequency, the
development of a comprehensive structure theory for ordinal data only started in
the early 1990s (see [SW92]). An obstacle for the development of a comprehensive
structure theory was the absence of a basic notion of ordinal structures. For
introducing a mathematical definition of ordinal structures, Representational
Measurement Theory was taken as an adequate framework (cf. [KLST71], [Ro79]).
In Representational Measurement Theory, empirical data are generally modelled
by relational structures which consist of a set and a family of relations on
this set; in this setting, dominances are usually mathematized by quasi-orders.
Let us recall that a quasi-order $\lesssim$ on a set $S$ is a reflexive
transitive binary relation between the elements of $S$, i.e., $x \lesssim x$ for
all $x \in S$, and $x \lesssim y$ and $y \lesssim z$ imply $x \lesssim z$ for all
$x, y, z \in S$. In [SW92], ordinal structures have been specified by those
relational structures in which all relations are quasi-orders. Formally, an
ordinal structure is a relational structure $(S, (\lesssim_m)_{m \in M})$ for
which the $\lesssim_m$ ($m \in M$) are quasi-orders; in the case of finitely many
quasi-orders, an ordinal structure is usually described by
$(S, \lesssim_1, \dots, \lesssim_n)$.

Receptor   Violet 430   Blue 458   Blue 485   Blue-Green 498
    1         147          153         89            57
    2         153          154        110            75
    3         145          152        125           100
    4          99          101        122           140
    5          46           85        103           127
    6          73           78         85           121
    7          14            2         46            52
    8          44           65         77            73
    9          87           59         58            52
   10          60           27         23            24
   11           0            0         40            39

Fig. 1. Amounts of absorption of four colour stimuli by receptors in the retina
of a goldfish

Ordinal structures are often given by object-attribute-tables as in the
example of Figure 1: the table shows data from [SF68] describing the amount of
absorption for four colour stimuli in the retina of a goldfish. For each colour
stimulus $m$, we obtain a quasi-order $\lesssim_m$ on the set $S$ of all
receptors as follows: $g \lesssim_m h$ if the colour stimulus $m$ does not yield
a greater value for the receptor $g$ than for the receptor $h$. According to
this understanding, the table in Figure 1 represents an ordinal structure
$(S, \lesssim_{430}, \lesssim_{458}, \lesssim_{485}, \lesssim_{498})$. Notice
that the quasi-orders are induced by an order on the attribute values. Thus, a
mathematical model of object-attribute-tables for describing ordinal structures
should include orderings on the sets of attribute values. An ordinal context is
therefore defined as a structure
$\mathbb{K} := (G, M, (W_m, \le_m)_{m \in M}, I)$ where $G$ and $M$ are sets,
$(W_m, \le_m)_{m \in M}$ is a family of ordered sets, and $I$ is a ternary
relation with $I \subseteq \bigcup_{m \in M} G \times \{m\} \times W_m$ such
that, for each $g \in G$ and $m \in M$, there is exactly one $w \in W_m$ with
$(g, m, w) \in I$ (see [Wi82]). The elements of $G$, $M$, and $W_m$ ($m \in M$)
are called objects, attributes, and attribute values, respectively; if
$(g, m, w) \in I$ we say: the object $g$ has the value $w$ for the attribute
$m$. An attribute $m$ of $\mathbb{K}$ can be understood as a mapping from $G$
into $W_m$; therefore we often write $m(g) = w$ instead of $(g, m, w) \in I$. In
many cases, the attribute value sets of an ordinal context are the same set $W$
and even the orderings might be equal; then we write for $\mathbb{K}$ also
$(G, M, (W, \le_m)_{m \in M}, I)$ or even $(G, M, (W, \le), I)$. For an ordinal
context $\mathbb{K} := (G, M, (W_m, \le_m)_{m \in M}, I)$, the corresponding
ordinal structure is defined by
$\mathbb{S}(\mathbb{K}) := (G, (\lesssim_m)_{m \in M})$ where the quasi-orders
are determined by $g \lesssim_m h :\Leftrightarrow m(g) \le_m m(h)$. Conversely,
every ordinal structure can be derived in this way from an ordinal context.
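
The passage from an object-attribute table to its ordinal structure can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `ordinal_structure` and `is_quasi_order` are hypothetical, attribute values are taken to be numbers with their usual order, and the sample rows are the first three receptors of Figure 1 restricted to two stimuli.

```python
from itertools import product

def ordinal_structure(table):
    """Return {attribute: quasi-order}, where each quasi-order is the set of
    pairs (g, h) with m(g) <= m(h)."""
    objects = list(table)
    attributes = sorted({m for row in table.values() for m in row})
    return {
        m: {(g, h) for g, h in product(objects, repeat=2)
            if table[g][m] <= table[h][m]}
        for m in attributes
    }

def is_quasi_order(relation, objects):
    """Check reflexivity and transitivity of a set of pairs."""
    reflexive = all((x, x) in relation for x in objects)
    transitive = all((x, z) in relation
                     for (x, y) in relation
                     for (y2, z) in relation if y == y2)
    return reflexive and transitive

# Receptors 1-3 of Figure 1, stimuli 430 and 498 only (illustrative excerpt).
table = {
    1: {430: 147, 498: 57},
    2: {430: 153, 498: 75},
    3: {430: 145, 498: 100},
}
orders = ordinal_structure(table)
```

Each value of `orders` is a quasi-order on the receptors, so the dictionary as a whole plays the role of the ordinal structure derived from the table.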

The purpose of a structure theory for ordinal data is to support meaningful
interpretations of the data. Since interpretations are always based on concepts
and their relationships, suitable conceptual structures are assigned to ordinal
structures resp. ordinal contexts by methods of Formal Concept Analysis (for
its basic notions and results, we refer to [GW99a]). For an ordinal context
$\mathbb{K} := (G, M, (W_m, \le_m)_{m \in M}, I)$, the most elementary method
for such an assignment is the plain ordinal scaling defined by forming the
derived formal context
$\mathbb{K}_O := (G, \bigcup_{m \in M} \{m\} \times W_m, I_O)$ with
$g \, I_O \, (m, w) :\Leftrightarrow m(g) \le w$ and its concept lattice
(cf. [GW89]). Figure 2 represents the formal context derived from the data table
in Figure 1 by plain ordinal scaling. The concept lattice of this context is shown

Fig. 2. The formal context derived from Figure 1 by plain ordinal scaling



in Figure 3. The attribute concepts derived from the first colour stimulus
(Violet 430) form a chain located on the left side of the diagram, while the
chain of the attribute concepts derived from the fourth colour stimulus
(Blue-Green 498) is on the right side; the chains of the other two stimuli
(Blue 458 and Blue 485) are located in between in correspondence to the order of
the wavelengths 430 < 458 < 485 < 498. Is the recognition of this betweenness
independent of the choice of the specific diagram representing the concept
lattice? Since there is no adequate notion of betweenness of chains in lattices,
the question arises how one can derive an ordinally invariant notion of
betweenness in ordinal structures.





On the Contextual Logic of Ordinal Data 



309 




Fig. 3. The concept lattice of the formal context in Figure 2 



Such a notion should allow us, in particular, to justify, by the ordinal data
in Figure 1, that the betweenness of the colour stimuli, according to their
wavelength, is perceived by the goldfish.
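
Plain ordinal scaling, as defined in this section, can be illustrated with a short Python sketch. It rests on assumptions that are not part of the text: the value set of each attribute is taken to be the set of values actually occurring in the table, values are numbers with their usual order, and the names `plain_ordinal_scaling` and `attribute_extent` are hypothetical.

```python
def plain_ordinal_scaling(table):
    """Derive a formal context from {object: {attribute: value}}.

    The derived attributes are the pairs (m, w); object g has (m, w)
    exactly when m(g) <= w."""
    derived_attributes = sorted({(m, w) for row in table.values()
                                 for m, w in row.items()})
    incidence = {(g, (m, w))
                 for g, row in table.items()
                 for (m, w) in derived_attributes
                 if row[m] <= w}
    return derived_attributes, incidence

def attribute_extent(incidence, attribute):
    """All objects that have the given derived attribute."""
    return {g for (g, a) in incidence if a == attribute}

# Receptors 1-3 of Figure 1 for the stimulus Violet 430 only.
table = {1: {430: 147}, 2: {430: 153}, 3: {430: 145}}
attributes, incidence = plain_ordinal_scaling(table)
chain = [attribute_extent(incidence, a) for a in attributes]
```

The extents collected in `chain` are nested, which is the reason why the attribute concepts derived from a single stimulus form a chain in the concept lattice.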

In Representational Measurement Theory an ordinally invariant notion of
betweenness is derived for ordinal structures by meaningful representations of
ordinal structures in ordered vector spaces. An elementary approach to such
representations is described in [WW96] (see also [W195]). The basic step toward
an ordered vector space representation consists of constructing an addition
starting, without loss of generality, with an ordinal context
$(G, \{m_0, m_1, \dots, m_n\}, (W, \le), I)$. Let us assume that we have already
an addition $+$ on $W$ such that $(W, +, \le)$ is an ordered abelian group with
$m_0(g) = m_1(g) + \dots + m_n(g)$ for all $g \in G$. Obviously, these
equalities have the following properties: if all summands on the right side
increase then the left side has to increase, and if all summands except one on
the right side increase, but the left side decreases, then the remaining summand
on the right side has to decrease too. These properties can be translated to the
following symmetrized conditions for the derived ordinal structure
$(G, \lesssim_0, \lesssim_1, \dots, \lesssim_n)$ for which, as defined above,
$g \lesssim_i h :\Leftrightarrow m_i(g) \le m_i(h)$ ($i = 1, \dots, n$) but,
conversely, $g \lesssim_0 h :\Leftrightarrow m_0(g) \ge m_0(h)$:

($A_i$) $g \lesssim_j h$ for all $j \in \{0, 1, \dots, n\} \setminus \{i\}$
implies $g \gtrsim_i h$ ($i = 0, \dots, n$).

In the case of the condition ($A_i$), we say that the quasi-order $\lesssim_i$
is anti-ordinally dependent on the quasi-orders $\lesssim_j$
($j \in \{0, 1, \dots, n\} \setminus \{i\}$).
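
The conditions of anti-ordinal dependency can be checked mechanically on finite data. The following Python sketch assumes integer-valued attributes and uses hypothetical helper names (`quasi_order`, `anti_ordinally_dependent`); the sample values are invented so that the first attribute is the sum of the other two, which is the situation described above.

```python
from itertools import product

def quasi_order(values, reverse=False):
    """Pairs (g, h) with values[g] <= values[h], or >= when reverse is set."""
    return {(g, h) for g, h in product(values, repeat=2)
            if (values[g] >= values[h] if reverse else values[g] <= values[h])}

def anti_ordinally_dependent(orders, i):
    """Condition (A_i): if g is below h in every order except the i-th,
    then h is below g in the i-th order."""
    others = [r for j, r in enumerate(orders) if j != i]
    common = set.intersection(*others)
    return all((h, g) in orders[i] for (g, h) in common)

m1 = {"a": 1, "b": 2, "c": 4}
m2 = {"a": 3, "b": 1, "c": 2}
m0 = {g: m1[g] + m2[g] for g in m1}          # m0 = m1 + m2

# The order belonging to m0 is reversed, as in the symmetrized conditions.
additive = [quasi_order(m0, reverse=True), quasi_order(m1), quasi_order(m2)]

# A non-additive choice for m0 that violates (A_0).
bad_m0 = {"a": 4, "b": 3, "c": 1}
broken = [quasi_order(bad_m0, reverse=True), quasi_order(m1), quasi_order(m2)]
```

For the additive data all three conditions hold, whereas `anti_ordinally_dependent(broken, 0)` fails because `b` lies below `c` for both summands but not for the altered total.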

It turns out that the anti-ordinal dependency conditions allow us to construct
an n-quasi-group operation. For obtaining even an ordered abelian group, we need
further structural properties. For their formulation, two types of equivalence
relations are defined for ordinal structures ($i, j = 0, 1, \dots, n$):

$\Theta_i := \{(g, h) \mid g \lesssim_i h \text{ and } h \lesssim_i g\}$ and
$\Psi_{ij} := \bigcap \{\Theta_k \mid k \in \{0, 1, \dots, n\} \setminus \{i, j\}\}$.

Now, we can define the so-called “shifting conditions” ($i = 1, \dots, n$ with $n \ge 3$):

($S_i$) $g \,\Psi_{0i}\, k$, $k \,\Theta_i\, l$, $l \,\Psi_{0i}\, h$, $h \,\Theta_i\, g$, and $k \,\Theta_0\, l$ imply $g \,\Theta_0\, h$.

For an ordinal structure, the conditions of anti-ordinal dependencies and
shifting guarantee a representation by an ordered vector space, which is
meaningful, and yields an ordinally invariant notion of betweenness (see [WW95],
[WW96]). For finding out how to check those kinds of conditions, it is useful to
better understand the contextual logic of ordinal structures.

2 Contextual Logic of Binary Power Context Families 

Before discussing the contextual logic of ordinal structures specifically, we
outline the more general Contextual Logic of Binary Relations (cf. [EGSW00]).
Semantically, this logic is based on binary power context families
$\vec{\mathbb{K}} := (\mathbb{K}_0, \mathbb{K}_1, \mathbb{K}_2)$ consisting of
the formal contexts $\mathbb{K}_k := (G_k, M_k, I_k)$ ($k = 0, 1, 2$) with
$G_k \subseteq (G_0)^k$ for $k = 1, 2$ (cf. [Wi97]). The formal concepts of
$\mathbb{K}_1$ and $\mathbb{K}_2$ have as extents unary and binary relations and
are therefore called unary and binary relational concepts, respectively.

Now, we start to describe, in analogy to [GW99b], the Contextual Attribute
Logic of binary power context families $\vec{\mathbb{K}}$ by recursively
defining compound attributes for $\vec{\mathbb{K}}$ with the operational
elements $\neg$, $\bigwedge$, $\bigvee$, $*$, and $\circ$:

– Each attribute $m \in M_k$ ($k = 0, 1, 2$) is a compound attribute and also
  the “constants” $\bot_k$, $\top_k$, and $\mathrm{id}_2$ with the extents

  $(\bot_k)^{I_k} := \emptyset$, $(\top_k)^{I_k} := G_k$, and
  $(\mathrm{id}_2)^{I_2} := \{(g, g) \mid g \in G_0\}$.

– For each attribute $m \in M_k$ ($k = 0, 1, 2$), we define its negation
  $\neg m$ to be a compound attribute with

  $(\neg m)^{I_k} := G_k \setminus m^{I_k}$.

  Thus, $g$ is in the extent of $\neg m$ if and only if $g$ is not in the
  extent of $m$.

– For each set of attributes $A \subseteq M_k$ ($k = 0, 1, 2$), we define the
  conjunction $\bigwedge A$ and the disjunction $\bigvee A$ to be compound
  attributes for which

  $(\bigwedge A)^{I_k} := \bigcap_{m \in A} m^{I_k}$ and
  $(\bigvee A)^{I_k} := \bigcup_{m \in A} m^{I_k}$.

  Thus, $g$ is in the extent of $\bigwedge A$ if and only if $g$ is in the
  extent of all attributes $m \in A$, and $g$ is in the extent of $\bigvee A$
  if and only if $g$ is in the extent of some attribute $m \in A$. Important
  compound attributes are the sequents
  $\bigvee (S \cup \{\neg m \mid m \in A\})$ determined by two subsets
  $A, S \subseteq M_k$, respectively, and denoted by $(A, S)$ for short.

– For each binary attribute $m \in M_2$, we define its conversion $*m$ to be a
  compound attribute with

  $(*m)^{I_2} := \{(g, h) \in G_2 \mid (h, g) \in m^{I_2}\}$.

  Thus, $(g, h)$ is in the extent of $*m$ if and only if $(h, g)$ is in the
  extent of $m$.

– For each set $\mathcal{P} \subseteq \mathfrak{P}(M_0)$ and each two
  attributes $m, n \in M_2$, we define the $\mathcal{P}$-concatenation
  $m \circ_{\mathcal{P}} n$ to be the compound attribute with

  $(m \circ_{\mathcal{P}} n)^{I_2} := \{(g, h) \in G_2 \mid \exists B \in \mathcal{P} \; \exists l \in (\neg(B, M_0 \setminus B))^{I_0} : (g, l) \, I_2 \, m \text{ and } (l, h) \, I_2 \, n\}$.

  Thus, $(g, h)$ is in the extent of $m \circ_{\mathcal{P}} n$ if and only if
  there exists a set $B \in \mathcal{P}$ and an element $l \in G_0$ such that
  $(g, l)$ is in the extent of the attribute $m$, $(l, h)$ is in the extent of
  $n$, and $l$ is not in the extent of the sequent $(B, M_0 \setminus B)$.
  We write $\circ$ instead of $\circ_{\mathfrak{P}(M_0)}$.

– Iteration of the above compositions leads to further compound attributes,
  the extents of which are determined in the obvious manner.
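
On the level of extents, the operations just listed are ordinary set operations, which the following Python sketch makes explicit. It is an illustration only: the function names are hypothetical, the binary attribute `leq` is an invented example, and `concatenation` implements the usual relational product without the restricting set of intents that appears in the definition above.

```python
from itertools import product

def negation(extent, universe):
    """Extent of the negated attribute: complement within the universe."""
    return universe - extent

def conversion(extent):
    """Extent of *m: swap the components of every pair."""
    return {(h, g) for (g, h) in extent}

def concatenation(ext_m, ext_n):
    """Relational product: (g, h) whenever some l has (g, l) in m and (l, h) in n."""
    return {(g, h) for (g, l) in ext_m for (l2, h) in ext_n if l == l2}

G0 = {1, 2, 3}
G2 = set(product(G0, repeat=2))
leq = {(g, h) for (g, h) in G2 if g <= h}    # extent of an invented attribute

identity = leq & conversion(leq)             # conjunction is intersection
total = leq | conversion(leq)                # disjunction is union
strictly_greater = negation(leq, G2)
```

Here the conjunction of `leq` with its conversion yields the extent of the identity, and concatenating `leq` with itself returns `leq` again because the relation is reflexive and transitive.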

The most important operations for binary relations are captured by the
introduced compound attributes; this should be clear, except for the concatenation.
Since the notions of contextual attribute logic have to be expressed purely on the 
attribute level, the reference to objects in the definition of the concatenation has 
to be replaced by an attribute description. The following propositions show that 
this may still yield the usual concatenation for binary relations (cf. [GW99b]). 

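Proposition 1 In every formal context $\mathbb{K} := (G, M, I)$, for $g \in G$
and $B \subseteq M$, $g \in (\neg(B, M \setminus B))^I$ is equivalent to
$g^I = B$, i.e., the object $g$ is, up to object-clarification, uniquely
determined by the extent $(\neg(g^I, M \setminus g^I))^I$.

Proof: Since $(B, M \setminus B) = \bigvee((M \setminus B) \cup \{\neg m \mid m \in B\})$,
we obtain

$(\neg(B, M \setminus B))^I = (\bigwedge(B \cup \{\neg m \mid m \in M \setminus B\}))^I = B^I \cap \bigcap\{G \setminus m^I \mid m \in M \setminus B\}$.

If $g^I = B$ this yields
$g \in g^{II} \cap \bigcap\{G \setminus m^I \mid m \in M \setminus g^I\} = (\neg(B, M \setminus B))^I$
because $g \in g^{II}$ and $g \notin m^I$ for each $m \in M \setminus g^I$.
Conversely, $g \in (\neg(B, M \setminus B))^I$ implies $g^I \subseteq B$,
because of $g \notin m^I$ for each $m \in M \setminus B$, and
$B \subseteq g^I$, because $g \in B^I$ and therefore
$g^I \supseteq B^{II} \supseteq B$. □

Proposition 2 Let $\vec{\mathbb{K}} := (\mathbb{K}_0, \mathbb{K}_1, \mathbb{K}_2)$
be a binary power context family, let
$\mathcal{G}_0 := \{g^{I_0} \mid g \in G_0\}$, and let $m, n \in M_2$. Then

$(m \circ_{\mathcal{G}_0} n)^{I_2} = \{(g, h) \in G_2 \mid \exists l \in G_0 : (g, l) \in m^{I_2} \text{ and } (l, h) \in n^{I_2}\}$.

Proof: Let $(g, l) \in m^{I_2}$ and $(l, h) \in n^{I_2}$. Then, by
Proposition 1, we have $l \in (\neg(l^{I_0}, M_0 \setminus l^{I_0}))^{I_0}$,
hence $(g, h) \in (m \circ_{\mathcal{G}_0} n)^{I_2}$. Conversely, let
$(g, h) \in (m \circ_{\mathcal{G}_0} n)^{I_2}$. Then there exist a
$B \in \mathcal{G}_0$ and an $l \in (\neg(B, M_0 \setminus B))^{I_0}$ with
$(g, l) \, I_2 \, m$ and $(l, h) \, I_2 \, n$; hence $(g, h)$ is in the set on
the right side of the asserted equality. □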
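
Proposition 1 can be checked by brute force on a small context. The sketch below is an illustration under assumptions: the context (`G`, `M`, `I`) is invented, and the helper names are hypothetical. The extent of the negated sequent is computed as the set of objects that have every attribute in `B` and none outside `B`, following the formula in the proof of Proposition 1.

```python
from itertools import chain, combinations

def powerset(items):
    """All subsets of items, as tuples."""
    items = list(items)
    return chain.from_iterable(combinations(items, r)
                               for r in range(len(items) + 1))

def negated_sequent_extent(G, M, I, B):
    """Objects having every attribute in B and no attribute in M minus B."""
    return {g for g in G
            if all((g, m) in I for m in B)
            and not any((g, m) in I for m in M - B)}

G = {"g1", "g2", "g3"}
M = {"a", "b", "c"}
I = {("g1", "a"), ("g1", "b"), ("g2", "b"), ("g3", "a"), ("g3", "b")}

def intent(g):
    """The object intent of g."""
    return {m for m in M if (g, m) in I}

# g lies in the extent of the negated sequent for B exactly when its intent is B.
holds = all((g in negated_sequent_extent(G, M, I, set(B))) == (intent(g) == set(B))
            for g in G for B in powerset(M))
same_intent = negated_sequent_extent(G, M, I, {"a", "b"})
```

Since `g1` and `g3` share the intent `{"a", "b"}`, the set `same_intent` contains both of them, which illustrates the phrase "up to object-clarification".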

Logically, a basic question concerning binary power context families
$\vec{\mathbb{K}} := (\mathbb{K}_0, \mathbb{K}_1, \mathbb{K}_2)$ is whether the
extents of compound attributes are equal or not. Two $k$-ary compound attributes
$m$ and $n$ are called extensionally equivalent in $\mathbb{K}_k$ (in symbols:
$m \approx_{\mathbb{K}_k} n$) if they have the same extent in $\mathbb{K}_k$;
they are said to be globally equivalent (in symbols: $m \approx n$) if they have
the same extent in each binary power context family with the same attribute sets
$M_0$, $M_1$ and $M_2$. Since $m \approx_{\mathbb{K}_k} n$ is equivalent to
$(m \vee \neg n) \wedge (\neg m \vee n) \approx_{\mathbb{K}_k} \top_k$, we also
define the notion of all-extensionality: A $k$-ary compound attribute $m$
($k = 0, 1, 2$) is called all-extensional in the binary power context family if
$m^{I_k} = G_k$. It is of basic interest to find effective methods for deciding
the all-extensionality for compound attributes of binary power context families.
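
The reduction of extensional equivalence to all-extensionality can be illustrated on plain sets. The Python sketch below assumes that the compound attribute expressing equivalence is the conjunction of the two implications between the attributes, which is one standard way to state it; the function names are hypothetical.

```python
from itertools import chain, combinations

def all_extensional(extent, universe):
    """A compound attribute is all-extensional if its extent is everything."""
    return extent == universe

def equivalence_extent(ext_m, ext_n, universe):
    """Extent of (m or not n) and (not m or n)."""
    return (ext_m | (universe - ext_n)) & ((universe - ext_m) | ext_n)

universe = {1, 2, 3}
subsets = [set(s) for s in chain.from_iterable(
    combinations(sorted(universe), r) for r in range(len(universe) + 1))]

# Equal extents correspond exactly to an all-extensional equivalence attribute.
reduction_ok = all(
    (a == b) == all_extensional(equivalence_extent(a, b, universe), universe)
    for a in subsets for b in subsets)
```

The exhaustive loop over all pairs of subsets confirms the reduction for this small universe; it is not a decision procedure for global equivalence.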

The contextual logic of binary context families is seen in the framework of 
the Peircean Algebraic Logic which R. W. Burch created as “an attempt to 
amalgamate various systems of logic that Peirce developed over his long career” 
(see [Bu91]). The relational operations which result from the recursive definition 
of the compound attributes can all be derived from the basic operators of the 
Peircean Algebraic Logic (cf. [EGSW00]).

3 Contextual Logic of Ordinal Contexts 

For specifying a contextual logic for an ordinal context

$\mathbb{K} := (G, M, (W_m, \le_m)_{m \in M}, I)$,

we transform the ordinal context into a binary power context family by the
method of relational scaling, which was introduced by [PW99]. The simplest way
of doing this is to define $\mathbb{K}_0$ as the context $\mathbb{K}_O$ derived
from the ordinal context $\mathbb{K}$ by plain ordinal scaling (see Section 1)
and $\mathbb{K}_2$ as the formal context $\mathbb{K}_Q := (G^2, M, I_2)$ with
$(g, h) \, I_2 \, m :\Leftrightarrow m(g) \le_m m(h)$ ($\mathbb{K}_1$ may be
assumed to be empty and therefore omitted). The contextual logic of the binary
power context family $\vec{\mathbb{K}}_{OQ} := (\mathbb{K}_O, \mathbb{K}_Q)$ may
be understood as the semantic basis of the “canonical” contextual logic of the
ordinal context $\mathbb{K}$. The other methods of ordinal conceptual scaling
discussed in [SW92] and [GW99a] may produce richer contexts, but the compound
attributes still yield the same extents; hence the logical expressibility
remains constant.

An interesting question is how restricted the binary relations are which are
extents of compound attributes of the binary power context families
$\vec{\mathbb{K}}_{OQ}$. It might be surprising that one has to expect any
binary relation on the set $G$. This is the content of the following
proposition:

Proposition 3 For each binary relation $R$ on $G$, there is an ordinal context
$\mathbb{K}$ with the object set $G$ such that $R$ is the extent of a compound
attribute of $\vec{\mathbb{K}}_{OQ}$.

Proof: Let $R$ be an arbitrary binary relation on $G$. We regard the ordinal
context $\mathbb{K} := (G, M, (W_m, \le_m)_{m \in M}, I)$ with

  $M := \{m_g \mid g \in G\} \cup \{m_{gh} \mid g, h \in G \text{ and } g \ne h\}$,
  $W_g := \{0, 1\}$ with $0 \le_g 1$,
  $W_{gh} := G$ with $\le_{gh} := \mathrm{id}_G \cup \{(g, h)\}$,
  $m_g(k) := 1$ if $k = g$, and $m_g(k) := 0$ otherwise,
  $m_{gh}(k) := k$.

(We write $W_g$ instead of $W_{m_g}$, $\le_g$ instead of $\le_{m_g}$, and
$\le_{gh}$ instead of $\le_{m_{gh}}$.) Then, in the corresponding formal context
$\mathbb{K}_Q := (G^2, M, I_2)$, we have

  $(m_g)^{I_2} = ((G \setminus \{g\}) \times G) \cup \{(g, g)\}$,
  $(\neg m_g)^{I_2} = \{g\} \times (G \setminus \{g\})$,
  $(*m_g)^{I_2} = (G \times (G \setminus \{g\})) \cup \{(g, g)\}$,
  $(\neg{*m_g})^{I_2} = (G \setminus \{g\}) \times \{g\}$.

Therefore, we obtain $(\neg m_g \circ \neg{*m_g})^{I_2} = \{(g, g)\}$ and hence

  $(\bigvee\{\neg m_g \circ \neg{*m_g} \mid (g, g) \in R\})^{I_2} = \{(g, g) \mid (g, g) \in R\}$;

furthermore, it follows
$(m_{gh})^{I_2} = \{(k, k) \mid k \in G\} \cup \{(g, h)\}$ and from this
$(\neg m_{gh})^{I_2} = \{(k, l) \mid (k, l) \ne (g, h) \text{ and } k \ne l\}$
and so

  $(\bigwedge\{\neg m_{gh} \mid (g, h) \notin R\})^{I_2} = \{(k, l) \mid (k, l) \in R \text{ and } k \ne l\}$.

Finally, we get
$(\bigvee\{\neg m_g \circ \neg{*m_g} \mid (g, g) \in R\} \vee \bigwedge\{\neg m_{gh} \mid (g, h) \notin R\})^{I_2} = R$. □
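
The construction in the proof of Proposition 3 can be replayed on a small object set. The Python sketch below is an illustration under explicit assumptions: the attribute attached to an object `g` is read as taking the value 1 at `g` and 0 elsewhere, which is the reading that agrees with the four extents listed in the proof; the object set has at least two elements; and at least one pair of distinct objects lies outside `R`, so that the conjunction is not empty. The function name `build_relation` is hypothetical.

```python
from itertools import product

def build_relation(G, R):
    """Rebuild R from the extents used in the proof of Proposition 3."""
    G2 = set(product(G, repeat=2))

    def ext_m_g(g):
        # pairs (k, l) with m_g(k) <= m_g(l), where m_g is 1 at g and 0 elsewhere
        value = lambda k: 1 if k == g else 0
        return {(k, l) for (k, l) in G2 if value(k) <= value(l)}

    def ext_m_gh(g, h):
        # the order on the values is the identity plus the single pair (g, h)
        return {(k, l) for (k, l) in G2 if k == l or (k, l) == (g, h)}

    def conv(ext):
        return {(l, k) for (k, l) in ext}

    def concat(a, b):
        return {(k, h) for (k, l) in a for (l2, h) in b if l == l2}

    # disjunction of the concatenations for all diagonal pairs in R
    diagonal_part = set()
    for g in G:
        if (g, g) in R:
            diagonal_part |= concat(G2 - ext_m_g(g), G2 - conv(ext_m_g(g)))

    # conjunction of the negated attributes for all off-diagonal pairs not in R
    missing = [(g, h) for (g, h) in G2 if g != h and (g, h) not in R]
    assert missing, "sketch assumes some off-diagonal pair lies outside R"
    off_diagonal_part = set(G2)
    for (g, h) in missing:
        off_diagonal_part &= G2 - ext_m_gh(g, h)

    return diagonal_part | off_diagonal_part

G = [1, 2, 3]
R = {(1, 1), (1, 2), (3, 2), (3, 3)}
rebuilt = build_relation(G, R)
```

For the sample relation the rebuilt extent coincides with `R`. The case in which `R` contains every pair of distinct objects is excluded here because the conjunction would then range over the empty set.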

Now, we want to apply the presented language of the contextual logic of
ordinal contexts to the spatial representation problem. The question is still
open whether the ordinal structure given by the ordinal data in Figure 1 has a
meaningful representation in an ordered vector space or not. For a binary power
context family $\vec{\mathbb{K}}_{OQ} := (\mathbb{K}_O, \mathbb{K}_Q)$, obtained
by relational scaling of an ordinal context, the conditions ($A_i$) of
anti-ordinal dependency ($i = 0, 1, \dots, n$) lead in general to the compound
attributes

  $o_i := {*q_i} \vee \bigvee\{\neg q_k \mid k \in \{0, 1, \dots, n\} \setminus \{i\}\}$,

($q_0, \dots, q_n \in M \cup {*M}$) such that $q_i$ is anti-ordinally dependent
on the $q_k$ with $k \in \{0, 1, \dots, n\} \setminus \{i\}$ if and only if
$o_i$ is all-extensional in $\mathbb{K}_Q$. For translating the shifting
conditions ($S_i$), it is more convenient to use the equivalent permutability
conditions
$(\Theta_i \cap \Theta_0) \circ \Psi_{0i} = \Psi_{0i} \circ (\Theta_i \cap \Theta_0)$
($i = 1, \dots, n$). Let us define the following abbreviations
($i, j = 0, 1, \dots, n$):

  $e_i := q_i \wedge {*q_i}$ and
  $e_{ij} := \bigwedge\{e_k \mid k \in \{0, 1, \dots, n\} \setminus \{i, j\}\}$.

For the binary power context family $\vec{\mathbb{K}}_{OQ}$ and
$\mathcal{G}_0 := \{g^{I_0} \mid g \in G_0\}$, the terms of the permutability
conditions lead to the compound attributes

  $s_i := (e_i \wedge e_0) \circ_{\mathcal{G}_0} e_{0i}$ and
  $t_i := e_{0i} \circ_{\mathcal{G}_0} (e_i \wedge e_0)$,

such that
$(e_i^{I_2} \cap e_0^{I_2}) \circ e_{0i}^{I_2} = e_{0i}^{I_2} \circ (e_i^{I_2} \cap e_0^{I_2})$
if and only if $s_i$ and $t_i$ are extensionally equivalent in $\mathbb{K}_Q$.

What help do the described compound attributes give to answer the
representation question for our example? By relational scaling of the ordinal
context represented in Figure 1, we obtain the binary power context family
$\vec{\mathbb{K}}_{OQ} := (\mathbb{K}_O, \mathbb{K}_Q)$ with
$G_O := \{1, \dots, 11\}$,
$M_O := \{(430, 0), (430, 14), \dots, (498, 140)\}$, $G_Q := (G_O)^2$, and
$M_Q := \{q_{430}, q_{458}, q_{485}, q_{498}\}$. For our example, the conditions
($A_i$) of anti-ordinal dependency lead to the compound attributes

  $\neg q_j \vee \neg q_k \vee q_i$

with $(j, i, k) \in \{(430, 458, 485), (430, 458, 498), (430, 485, 498), (458, 485, 498)\}$,
such that ${*q_i}$ is anti-ordinally dependent on $q_j$ and $q_k$ if and only if
$\neg q_j \vee \neg q_k \vee q_i$ is all-extensional in $\mathbb{K}_Q$.
Obviously, it is not difficult to check the all-extensionality algorithmically;
hence one can easily confirm that, for any choice of three attributes in $M_Q$,
the described anti-ordinal dependency holds for those attributes.
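
The all-extensionality check mentioned above can be sketched in Python for the data of Figure 1. Two assumptions should be kept in mind: the extraction-damaged entry for receptor 2 at stimulus 485 is read as 110, and the names `ABSORPTION`, `q` and `dependency_holds` are introduced here for illustration only.

```python
from itertools import product

STIMULI = (430, 458, 485, 498)

# Rows of Figure 1; the value 110 for receptor 2 at 485 is an assumed reading.
ABSORPTION = {
    1: (147, 153, 89, 57),
    2: (153, 154, 110, 75),
    3: (145, 152, 125, 100),
    4: (99, 101, 122, 140),
    5: (46, 85, 103, 127),
    6: (73, 78, 85, 121),
    7: (14, 2, 46, 52),
    8: (44, 65, 77, 73),
    9: (87, 59, 58, 52),
    10: (60, 27, 23, 24),
    11: (0, 0, 40, 39),
}

def q(stimulus):
    """Extent of the binary attribute for a stimulus: pairs with m(g) <= m(h)."""
    col = STIMULI.index(stimulus)
    return {(g, h) for g, h in product(ABSORPTION, repeat=2)
            if ABSORPTION[g][col] <= ABSORPTION[h][col]}

def dependency_holds(j, i, k):
    """All-extensionality of (not q_j) or (not q_k) or q_i."""
    universe = set(product(ABSORPTION, repeat=2))
    extent = (universe - q(j)) | (universe - q(k)) | q(i)
    return extent == universe

TRIPLES = [(430, 458, 485), (430, 458, 498), (430, 485, 498), (458, 485, 498)]
results = {triple: dependency_holds(*triple) for triple in TRIPLES}
```

Under the stated reading of the table, `results` maps each of the four triples to `True`, in line with the positive answer reported in the text.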

Since the shifting conditions are only non-trivial for $n \ge 3$, they are of
no help for anti-ordinal triples of attributes. For this case, the Thomsen
condition (see [KLST71]) is meaningful, but there are not enough coincidences
between the values of the attributes in our example, so that a direct
application of the Thomsen condition does not work either. Nevertheless, after
the positive answer concerning the anti-ordinal dependencies, it is not
difficult to find a representation of our data in the Euclidean plane; such a
representation is shown in Figure 4 (cf. [WW96]).

Each colour stimulus is represented by a set of parallel lines: Violet 430 by
the vertical lines, Blue-Green 498 by the horizontal lines, and the other two
colour stimuli in between so that part of the colour circle becomes apparent.
The visible betweenness is independent of the specific representation of the
stimuli by sets of parallel lines in the Euclidean plane as proved in [WW95].
Thus, the ordinal data confirm the hypothesis that the goldfish perceives at
least part of the colour circle according to the wavelengths of the chosen
colour stimuli.

The example may indicate how the contextual logic of ordinal contexts can
support the interpretation of ordinal data. Expressing structural properties by
compound attributes allows us to activate general algorithms for determining
extensional equivalence or all-extensionality to check those properties in
ordinal contexts. Of course, efficient algorithms have still to be developed in
contextual logic of ordinal data, but at least it became clear in which
directions further developments should go. Many research problems should be
attacked as, for instance, the normal form and the word problem for compound
attributes with respect to the extensional equivalence.

Fig. 4. A geometric representation of the ordinal data in Figure 1 in the
Euclidean plane

In general, the contextual logic of ordinal
data should be elaborated within the framework of Peircean Algebraic Logic.

Abstract. The aim of this paper is to show how a Boolean Concept
Logic may be elaborated as a mathematical theory based on Formal Concept
Analysis [GW96]. For this purpose, concept lattices are extended by
further operations, mainly negation and opposition. Two extensions are
discussed which lead, on the one hand, to algebras of protoconcepts
equationally equivalent to double Boolean algebras and, on the other hand,
to concept algebras quasi-equationally equivalent to dicomplemented
lattices. In both cases, basic representation theorems are proved. These
results are not only basic for Contextual Concept Logic but also for
Contextual Judgment Logic with its theory of concept graphs.

1 Contextual Concept Logic

George Boole presented in his book “An Investigation of the Laws of Thought”
[Bo54] a mathematical theory of logic based on the notions of signs and classes,
where signs represent classes of objects. An important aim of his investigation is
“the derivation of the laws of the symbols of logic from the laws of the operations
of the human mind” [Bo54; p.39]. Boole argues that the symbolic laws of logic,
“determined a posteriori from the constitution of language, [...] are in reality
the laws of that mental operations” which he describes in his book [Bo54; p.44].
Mathematically, Boole developed his logic as a theory of symbolic operations
applied to classes of objects. This approach, today called the Boolean Class
Logic, became basic for modern mathematical logic.

In his book [Bo54; p.42], Boole pointed out that, in every discourse, “there
is an assumed or expressed limit within which the subjects of its operations
are confined”; he called the field within which all the objects of such
discourse are found the universe of discourse. In Contextual Attribute Logic
[GW99], in which signs are understood as attributes, Boole’s idea of a limited
universe of discourse finds its mathematical representation in the notion of a
“formal context”, the basic notion of Formal Concept Analysis (see [GW96]). A
formal context is defined as a triple $\mathbb{K} := (G, M, I)$ combining two
sets $G$ and $M$ with a binary relation $I$ between them, i.e.
$I \subseteq G \times M$; the elements of $G$ and $M$ are called objects and
attributes, respectively. For $Y \subseteq M$, the derivation is defined by
$Y' := \{g \in G \mid g I m \text{ for all } m \in Y\}$ (and dually, for
$X \subseteq G$, by $X' := \{m \in M \mid g I m \text{ for all } g \in X\}$); in
particular, for $m \in M$, we get $m' := \{m\}' = \{g \in G \mid g I m\}$, the
so-called attribute extent of $m$ which may be understood as the Boolean class
belonging to the “sign” $m$. In [GW99], it is outlined how this correspondence
may lead to a development of a Contextual Attribute Logic in the spirit of
Boole’s approach to mathematical logic.

The aim of this paper is to show how a Boolean Concept Logic may be 
elaborated as part of a Contextual Concept Logic, which is based on the mathe- 
matical theory of formal contexts and concepts presented in [GW96]. In general, 
the Contextual Concept Logic shall be developed together with a Contextual 
Judgment Logic as a mathematization of the traditional philosophical logic with 
the main aim to support conceptual knowledge representation and processing 
(see [Wi96], [Wi97a], [Wi97b], [Pr98], [GW99]).

A formal concept in Contextual Concept Logic is defined with respect to a
formal context $\mathbb{K} := (G, M, I)$, namely as a pair $(A, B)$ with
$A \subseteq G$ and $B \subseteq M$ such that $A' = B$ and $B' = A$; $A$ and $B$
are called the extent and the intent of the formal concept $(A, B)$,
respectively. The formal concepts of $\mathbb{K}$ can be derived by forming
$(X'', X')$ for $X \subseteq G$ and $(Y', Y'')$ for $Y \subseteq M$,
respectively, because of $X''' = X'$ and $Y' = Y'''$. The set
$\mathfrak{B}(\mathbb{K})$ of all formal concepts of the formal context
$\mathbb{K}$ carries the natural subconcept-superconcept-order given by
$(A_1, B_1) \le (A_2, B_2) :\Leftrightarrow A_1 \subseteq A_2$
($\Leftrightarrow B_1 \supseteq B_2$). $\mathfrak{B}(\mathbb{K})$ with the
defined subconcept-superconcept-order is always a complete lattice, called the
concept lattice of the formal context $\mathbb{K}$ and denoted by
$\underline{\mathfrak{B}}(\mathbb{K})$. Since $\mathbb{K}$ can be reconstructed
from the concept lattice by determining the object concepts
$\gamma g := (\{g\}'', \{g\}')$ ($g \in G$) and the attribute concepts
$\mu m := (\{m\}', \{m\}'')$ ($m \in M$), which satisfy
$\gamma g \le \mu m \Leftrightarrow g I m$, the Contextual Attribute Logic may
be also expressed in terms of concept lattices of formal contexts and so be
understood as part of Contextual Concept Logic.
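
The derivation operators and the set of all formal concepts can be computed directly for small contexts. The Python sketch below is a naive illustration: it enumerates all subsets of the object set and is therefore exponential, the function names are hypothetical, and the three-object context is an invented example.

```python
from itertools import chain, combinations

def derive_objects(X, M, I):
    """X': the attributes shared by all objects in X."""
    return frozenset(m for m in M if all((g, m) in I for g in X))

def derive_attributes(Y, G, I):
    """Y': the objects having all attributes in Y."""
    return frozenset(g for g in G if all((g, m) in I for m in Y))

def formal_concepts(G, M, I):
    """All pairs (X'', X') for subsets X of G (small contexts only)."""
    subsets = chain.from_iterable(combinations(sorted(G), r)
                                  for r in range(len(G) + 1))
    concepts = set()
    for X in subsets:
        intent = derive_objects(X, M, I)
        extent = derive_attributes(intent, G, I)
        concepts.add((extent, intent))
    return concepts

G = {1, 2, 3}
M = {"a", "b"}
I = {(1, "a"), (2, "a"), (2, "b"), (3, "b")}
concepts = formal_concepts(G, M, I)
```

For this context the computation yields four formal concepts, with extents `{2}`, `{1, 2}`, `{2, 3}` and `{1, 2, 3}`.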

For extending the Boolean Attribute Logic to a Boolean Concept Logic, 
the main obstacle is the absence of formal negations on the conceptual level. 
Therefore we first concentrate on the question how to introduce formal negations 
in conceptual structures based on formal contexts. In the history of logic one can 
distinguish essentially four forms of negations (cf. [Me84]): (1) negative act of 
judgment, (2) negation of propositional logic, (3) negative copula, (4) negative 
concept or property. The first three forms of negations have to be discussed 
with respect to Contextual Judgment Logic which is not in the scope of this 
paper. Thus, we will focus on the form of negative concepts and properties. 
Boole understood a negated sign as the representation of the complement of the
class represented by the original sign in the given universe of discourse; he writes:
“if from the conception of the universe, as consisting of “men” and “not-men”,
we exclude the conception of “men”, the resulting conception is that of the
contrary class, “not-men”.” [Bo94; p.48] Mathematically, Boole takes, within
the class of the considered universe, the complement of the class represented by
the original sign as the class of the negated sign. If this is analogously done
with a formal concept $(A, B)$ in a formal context $\mathbb{K} := (G, M, I)$, the problem arises
that the complement $G \setminus A$ of the extent $A$ need not be an extent again (and
that the complement $M \setminus B$ of the intent $B$ need not be an intent again). For
obtaining the negation as an operation on the set of formal concepts of the formal
context, we may take as the negation of $(A, B)$ the formal concept generated by
$G \setminus A$, namely $((G \setminus A)'', (G \setminus A)')$. But then we have the disadvantage that the
extents of the formal concept and its negation might not be disjoint; nevertheless,
we will investigate this “weak negation” further by studying so-called “concept
algebras” in Section 4.

The only possible way for keeping the correspondence between negation and
set-complement is to generalize the notion of formal concepts. This has already
been done more than a decade ago by introducing the notion of a semiconcept (see
[LW91], [Wi92]) which algebraically led to the so-called double Boolean algebras
(see [HLSW99]). The approach in this paper is based on the slightly more general
notion of a protoconcept: A protoconcept of a formal context $\mathbb{K} := (G, M, I)$
is defined as a pair $(A, B)$ with $A \subseteq G$ and $B \subseteq M$ such that $A' = B''$,
which is equivalent to $B' = A''$. The protoconcepts of a formal context $\mathbb{K}$ may
be understood as those formal concepts $(A, B)$ of subcontexts of $\mathbb{K}$ for which
the same formal concept of $\mathbb{K}$ is derived from $A$ and from $B$ by applying the
derivations of $\mathbb{K}$, i.e. $(A'', A') = (B', B'')$. Thus, a formal concept $(A, B)$ of
a subcontext is a protoconcept of the whole context if and only if it extends
uniquely to a formal concept $(C, D)$ of the whole context for which $C \times D$ is
the smallest extent-intent-product containing $A \times B$, i.e. the protoconcept $(A, B)$
represents the essential information about the formal concept $(C, D)$. The notion
of a protoconcept is therefore helpful for understanding which conceptual
information carries over from a formal context to some of its contextual extensions.
An extreme case is the extension of a formal context $(G, M, I)$ to the
context $(G \cup \{h\}, M \cup \{n\}, I \cup (G \times \{n\}) \cup (\{h\} \times M) \cup \{(h, n)\})$ because each formal
concept of $(G, M, I)$ is a “proper” protoconcept of the extended context and
each formal concept of the extended context, which is neither the smallest nor
the greatest, is the unique extension of a formal concept of $(G, M, I)$.
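
A minimal sketch of the protoconcept condition $A' = B''$ and of the extreme extension just described may help. The context, the names `h` and `n` for the added object and attribute, and all helper functions are assumptions chosen for illustration.

```python
from itertools import combinations


def derivations(G, M, I):
    """Return the two derivation operators of the context (G, M, I)."""
    def up(A):
        return frozenset(m for m in M if all((g, m) in I for g in A))

    def down(B):
        return frozenset(g for g in G if all((g, m) in I for m in B))

    return up, down


def powerset(S):
    """All subsets of the list S as frozensets."""
    return [frozenset(c) for r in range(len(S) + 1) for c in combinations(S, r)]


def protoconcepts(G, M, I):
    """All pairs (A, B) with A ⊆ G, B ⊆ M and A' = B''."""
    up, down = derivations(G, M, I)
    return [(A, B) for A in powerset(G) for B in powerset(M)
            if up(A) == up(down(B))]


def concepts(G, M, I):
    """All formal concepts (X'', X') of the context."""
    up, down = derivations(G, M, I)
    return {(down(up(X)), up(X)) for X in powerset(G)}


# Assumed toy context (four-elements table, hypothetical names).
G = ["fire", "air", "water", "earth"]
M = ["warm", "cold", "dry", "moist"]
I = {
    ("fire", "warm"), ("fire", "dry"),
    ("air", "warm"), ("air", "moist"),
    ("water", "cold"), ("water", "moist"),
    ("earth", "cold"), ("earth", "dry"),
}

# Extension by a new object h having all attributes and a new attribute n
# shared by all objects, as in the extreme case described in the text.
G_EXT = G + ["h"]
M_EXT = M + ["n"]
I_EXT = I | {(g, "n") for g in G} | {("h", m) for m in M} | {("h", "n")}

if __name__ == "__main__":
    extended = set(protoconcepts(G_EXT, M_EXT, I_EXT))
    for c in concepts(G, M, I):
        print(sorted(c[0]), sorted(c[1]), c in extended)
```

Under these assumptions every formal concept of the small context shows up among the protoconcepts of the extended one, although none of them is a formal concept there.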

Basic definitions and results concerning algebras of protoconcepts are presented
in the next section. The considered operations on the set of all protoconcepts
of a formal context are meet, join, negation, opposition, nothing, and all.
These operations give rise to algebraic structures: the so-called double Boolean
algebras. A main result of this paper, proved in Section 3, states that each double
Boolean algebra is embeddable into some algebra of protoconcepts. This
representation theorem has the consequence that the equational axioms of double
Boolean algebras determine the equational theory of the algebras of protoconcepts
(in analogy to the well-known result that the equational axioms of Boolean
algebras determine the equational theory of powerset algebras).

2 Algebras of Protoconcepts

The set $\mathfrak{P}(\mathbb{K})$ of all protoconcepts of a formal context $\mathbb{K} := (G, M, I)$ is structured
by the generalization order $\sqsubseteq$ which is defined by

\[ (A_1, B_1) \sqsubseteq (A_2, B_2) :\Leftrightarrow A_1 \subseteq A_2 \text{ and } B_1 \supseteq B_2. \]

In general, this order does not yield a lattice structure. However, there are operations
of logical nature on $\mathfrak{P}(\mathbb{K})$ which can be defined as follows:

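\[
\begin{aligned}
(A_1, B_1) \sqcap (A_2, B_2) &:= (A_1 \cap A_2, (A_1 \cap A_2)') \\
(A_1, B_1) \sqcup (A_2, B_2) &:= ((B_1 \cap B_2)', B_1 \cap B_2) \\
\neg(A, B) &:= (G \setminus A, (G \setminus A)') \\
\lrcorner(A, B) &:= ((M \setminus B)', M \setminus B) \\
\bot &:= (\emptyset, M) \\
\top &:= (G, \emptyset)
\end{aligned}
\]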

The set $\mathfrak{P}(\mathbb{K})$ together with the operations $\sqcap, \sqcup, \neg, \lrcorner, \bot$, and $\top$ is called the
algebra of protoconcepts of $\mathbb{K}$ and is denoted by $\underline{\mathfrak{P}}(\mathbb{K})$; the operations are named
“meet”, “join”, “negation”, “opposition”, “nothing”, and “all”. The meet is
determined by the intersection of object sets (extensional conjunction) and the
join by the intersection of attribute sets (intensional conjunction); the negation
is determined by the complement of an object set (Boole’s contrary class) and
the opposition by the complement of an attribute set (Boethius’ opposita
contraria); the constants nothing and all represent the extreme states of an object
set (contradictory and universal state).
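
The six operations can be written down directly from their definitions. The following is a sketch only: the four-elements context and its object and attribute names are assumed for illustration, and the function names `meet`, `join`, `neg`, and `opp` are hypothetical.

```python
# Assumed toy context (four-elements table, hypothetical names).
G = ["fire", "air", "water", "earth"]
M = ["warm", "cold", "dry", "moist"]
I = {
    ("fire", "warm"), ("fire", "dry"),
    ("air", "warm"), ("air", "moist"),
    ("water", "cold"), ("water", "moist"),
    ("earth", "cold"), ("earth", "dry"),
}
GS, MS = frozenset(G), frozenset(M)


def up(A):
    """A': attributes common to all objects in A."""
    return frozenset(m for m in M if all((g, m) in I for g in A))


def down(B):
    """B': objects having all attributes in B."""
    return frozenset(g for g in G if all((g, m) in I for m in B))


def is_protoconcept(x):
    """(A, B) is a protoconcept iff A' = B''."""
    return up(x[0]) == up(down(x[1]))


def meet(x, y):
    """Meet: intersect the object sets, then derive the attribute set."""
    A = x[0] & y[0]
    return (A, up(A))


def join(x, y):
    """Join: intersect the attribute sets, then derive the object set."""
    B = x[1] & y[1]
    return (down(B), B)


def neg(x):
    """Negation: complement of the object set."""
    A = GS - x[0]
    return (A, up(A))


def opp(x):
    """Opposition: complement of the attribute set."""
    B = MS - x[1]
    return (down(B), B)


NOTHING = (frozenset(), MS)   # bottom constant (∅, M)
ALL = (GS, frozenset())       # top constant (G, ∅)

water = (frozenset({"water"}), frozenset({"cold", "moist"}))
fire = (frozenset({"fire"}), frozenset({"warm", "dry"}))
earth = (frozenset({"earth"}), frozenset({"cold", "dry"}))
air = (frozenset({"air"}), frozenset({"warm", "moist"}))

if __name__ == "__main__":
    print(opp(water) == fire, opp(earth) == air)
    print(sorted(neg(water)[0]), sorted(neg(water)[1]))
```

With this assumed context the opposition exchanges the pairs of element concepts with complementary attribute sets, while the negation of a single element collects the three other elements and has an empty attribute set.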

The logical meaning of five of the defined operations should be clear, but the
opposition needs further explanation: Aristotle already discussed opposition
in general in his “Organon” and distinguished between four kinds of opposition
for which Boethius introduced the following Latin terms: “opposita relativa”,
“opposita contraria”, “habitus et privatio”, “opposita contradictoria” (cf.
[BM74]). The negation is defined according to the meaning of “opposita
contradictoria”, while the opposition may be understood as an operational
formalization of the meaning of “opposita contraria” as discussed in philosophical logic
until today. In [KL67], for instance, predicates $P$ and $Q$ are called “contrary” if
$P(x)$ always implies $\neg Q(x)$; this condition analogously holds for protoconcepts
$(A, B)$ and $\lrcorner(A, B)$ of formal contexts $(G, M, I)$ with $M' = \emptyset$. The special case
of “polar-contrary” opposition (also discussed in [KL67]) may occur especially
in formal contexts $(G, M, I)$ for which $M$ consists of dichotomic attribute pairs
$m_j$ and $n_j$ ($j \in J$), namely as contrary pairs of protoconcepts with the attribute
sets $\{m_j \mid j \in J_0\} \cup \{n_j \mid j \in J \setminus J_0\}$ and $\{n_j \mid j \in J_0\} \cup \{m_j \mid j \in J \setminus J_0\}$.

The Pythagoreans considered the oppositions as the origin of every being. 
The pre-Socratic philosopher Empedocles incorporated Pythagorean ideas in his 
philosophy and systematized the Pythagorean conception of the oppositions by 
tracing them back to the two oppositions moist $\leftrightarrow$ dry and cold $\leftrightarrow$ warm. By these
oppositions, Empedocles explained especially the doctrine of the four elements, 
as described by the formal context in Figure 1; Figure 2 shows the ordered set 
of all protoconcepts of the formal context in Figure 1.

Fig. 1. A context and its concept lattice representing Empedocles’ conception of the
four elements

Fig. 2. The algebra of protoconcepts of the formal context in Figure 1

Clearly, the opposition of the concept “water” yields the concept “fire” (and 
vice versa), while the opposition of the concept “earth” yields the concept “air” 
(and vice versa). To read this from the diagram in Figure 2, one has to un- 
derstand how to determine the object and attribute set of a protoconcept rep- 
resented by a particular circle in the diagram: those objects (resp. attributes) 
belong to the protoconcept whose names are written on circles on a path of line 
segments descending (resp. ascending) from the circle representing the proto- 
concept (notice that, for some protoconcepts, the object resp. attribute set is 
empty).

The protoconcepts of the formal context in Figure 1 have the special property
that one of their sets is the derivation of the other. This property also holds
for all protoconcepts which result from an operation in the algebras of protoconcepts
as the above definitions show. It is useful to conceptualize this property: A
semiconcept of a formal context $\mathbb{K} := (G, M, I)$ is defined as a pair $(A, B)$ with
$A \subseteq G$ and $B \subseteq M$ such that $A' = B$ or $B' = A$. Obviously, each semiconcept
is a protoconcept. For a better understanding of the structure of the protoconcept
algebra $\underline{\mathfrak{P}}(\mathbb{K})$, we consider the two types of semiconcepts collected in the
following two sets:

\[ \mathfrak{P}(\mathbb{K})_\sqcap := \{(A, A') \mid A \subseteq G\} \quad\text{and}\quad \mathfrak{P}(\mathbb{K})_\sqcup := \{(B', B) \mid B \subseteq M\}. \]

Then $\mathfrak{H}(\mathbb{K}) := \mathfrak{P}(\mathbb{K})_\sqcap \cup \mathfrak{P}(\mathbb{K})_\sqcup$ is the set of all semiconcepts of $\mathbb{K}$ which, together
with the restrictions of the operations $\sqcap, \sqcup, \neg, \lrcorner, \bot$, and $\top$, is a subalgebra of
$\underline{\mathfrak{P}}(\mathbb{K})$, denoted by $\underline{\mathfrak{H}}(\mathbb{K})$. Furthermore, $\mathfrak{B}(\mathbb{K}) = \mathfrak{P}(\mathbb{K})_\sqcap \cap \mathfrak{P}(\mathbb{K})_\sqcup$, and $(\mathfrak{B}(\mathbb{K}), \sqsubseteq)$
is the concept lattice of $\mathbb{K}$ in which the operations $\sqcap$ and $\sqcup$ yield the infimum and
the supremum of two formal concepts, respectively. For the structural analysis
of $\underline{\mathfrak{P}}(\mathbb{K})$, it is useful to define additional operations on $\mathfrak{P}(\mathbb{K})$:

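\[ x \vee y := \neg(\neg x \sqcap \neg y) \quad\text{and}\quad x \wedge y := \lrcorner(\lrcorner x \sqcup \lrcorner y), \]

\[ \overline{\top} := \neg\bot \quad\text{and}\quad \underline{\bot} := \lrcorner\top. \]

Obviously, $\mathfrak{P}(\mathbb{K})_\sqcap$ together with the restrictions of the operations $\sqcap, \vee, \neg, \bot, \overline{\top}$
is a Boolean algebra $\underline{\mathfrak{P}}(\mathbb{K})_\sqcap$, isomorphic to the Boolean algebra of all subsets of
$G$; dually, $\mathfrak{P}(\mathbb{K})_\sqcup$ together with the restrictions of the operations $\wedge, \sqcup, \lrcorner, \underline{\bot}, \top$
is a Boolean algebra $\underline{\mathfrak{P}}(\mathbb{K})_\sqcup$, antiisomorphic to the Boolean algebra of all subsets
of $M$. The order relation $\sqsubseteq$ on $\mathfrak{P}(\mathbb{K})$ agrees with the derivable order relations of
the two Boolean algebras $\underline{\mathfrak{P}}(\mathbb{K})_\sqcap$ and $\underline{\mathfrak{P}}(\mathbb{K})_\sqcup$.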

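Theorem 1 The following equations are valid in $\underline{\mathfrak{P}}(\mathbb{K})$:

1a) $(x \sqcap x) \sqcap y = x \sqcap y$    1b) $(x \sqcup x) \sqcup y = x \sqcup y$

2a) $x \sqcap y = y \sqcap x$    2b) $x \sqcup y = y \sqcup x$

3a) $x \sqcap (y \sqcap z) = (x \sqcap y) \sqcap z$    3b) $x \sqcup (y \sqcup z) = (x \sqcup y) \sqcup z$

4a) $x \sqcap (x \sqcup y) = x \sqcap x$    4b) $x \sqcup (x \sqcap y) = x \sqcup x$

5a) $x \sqcap (x \vee y) = x \sqcap x$    5b) $x \sqcup (x \wedge y) = x \sqcup x$

6a) $x \sqcap (y \vee z) = (x \sqcap y) \vee (x \sqcap z)$    6b) $x \sqcup (y \wedge z) = (x \sqcup y) \wedge (x \sqcup z)$

7a) $\neg\neg(x \sqcap y) = x \sqcap y$    7b) $\lrcorner\lrcorner(x \sqcup y) = x \sqcup y$

8a) $\neg(x \sqcap x) = \neg x$    8b) $\lrcorner(x \sqcup x) = \lrcorner x$

9a) $x \sqcap \neg x = \bot$    9b) $x \sqcup \lrcorner x = \top$

10a) $\neg\bot = \top \sqcap \top$    10b) $\lrcorner\top = \bot \sqcup \bot$

11a) $\neg\top = \bot$    11b) $\lrcorner\bot = \top$

12) $(x \sqcap x) \sqcup (x \sqcap x) = (x \sqcup x) \sqcap (x \sqcup x)$.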

Proof: The equations of the theorem can be easily verified in algebras of
protoconcepts. We only demonstrate this by proving 12): Let $x := (A, B) \in \mathfrak{P}(\mathbb{K})$.
Then $x \sqcap x = (A, A')$ and $x \sqcup x = (B', B)$. It follows that $(x \sqcap x) \sqcup (x \sqcap x) = (A'', A')$
and $(x \sqcup x) \sqcap (x \sqcup x) = (B', B'')$. By the definition of a protoconcept, we have
$A' = B''$ and $B' = A''$. Hence $(A'', A') = (B', B'')$, i.e., the equation 12) holds
in $\underline{\mathfrak{P}}(\mathbb{K})$. □
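
The remaining equations can also be checked mechanically on a small example. The sketch below brute-forces the equations 1a) to 12) over all protoconcepts of an assumed two-object context (chosen here because it contains protoconcepts that are not semiconcepts); it illustrates the theorem on one instance and is in no way a proof. All names are hypothetical.

```python
from itertools import combinations, product

# Assumed toy context, chosen so that proper protoconcepts exist.
G = [1, 2]
M = ["a", "b"]
I = {(1, "a"), (1, "b"), (2, "a")}
GS, MS = frozenset(G), frozenset(M)


def up(A):
    return frozenset(m for m in M if all((g, m) in I for g in A))


def down(B):
    return frozenset(g for g in G if all((g, m) in I for m in B))


def powerset(S):
    return [frozenset(c) for r in range(len(S) + 1) for c in combinations(S, r)]


# All protoconcepts: pairs (A, B) with A' = B''.
P = [(A, B) for A in powerset(G) for B in powerset(M) if up(A) == up(down(B))]


def meet(x, y):
    A = x[0] & y[0]
    return (A, up(A))


def join(x, y):
    B = x[1] & y[1]
    return (down(B), B)


def neg(x):
    A = GS - x[0]
    return (A, up(A))


def opp(x):
    B = MS - x[1]
    return (down(B), B)


BOT = (frozenset(), MS)
TOP = (GS, frozenset())


def vee(x, y):
    """Derived join: neg(neg x meet neg y)."""
    return neg(meet(neg(x), neg(y)))


def wedge(x, y):
    """Derived meet: opp(opp x join opp y)."""
    return opp(join(opp(x), opp(y)))


def axioms_hold(elements):
    """Check equations 1a) to 12) for all triples of the given elements."""
    for x, y, z in product(elements, repeat=3):
        checks = [
            meet(meet(x, x), y) == meet(x, y),                       # 1a
            join(join(x, x), y) == join(x, y),                       # 1b
            meet(x, y) == meet(y, x),                                # 2a
            join(x, y) == join(y, x),                                # 2b
            meet(x, meet(y, z)) == meet(meet(x, y), z),              # 3a
            join(x, join(y, z)) == join(join(x, y), z),              # 3b
            meet(x, join(x, y)) == meet(x, x),                       # 4a
            join(x, meet(x, y)) == join(x, x),                       # 4b
            meet(x, vee(x, y)) == meet(x, x),                        # 5a
            join(x, wedge(x, y)) == join(x, x),                      # 5b
            meet(x, vee(y, z)) == vee(meet(x, y), meet(x, z)),       # 6a
            join(x, wedge(y, z)) == wedge(join(x, y), join(x, z)),   # 6b
            neg(neg(meet(x, y))) == meet(x, y),                      # 7a
            opp(opp(join(x, y))) == join(x, y),                      # 7b
            neg(meet(x, x)) == neg(x),                               # 8a
            opp(join(x, x)) == opp(x),                               # 8b
            meet(x, neg(x)) == BOT,                                  # 9a
            join(x, opp(x)) == TOP,                                  # 9b
            neg(BOT) == meet(TOP, TOP),                              # 10a
            opp(TOP) == join(BOT, BOT),                              # 10b
            neg(TOP) == BOT,                                         # 11a
            opp(BOT) == TOP,                                         # 11b
            join(meet(x, x), meet(x, x)) == meet(join(x, x), join(x, x)),  # 12
        ]
        if not all(checks):
            return False
    return True


if __name__ == "__main__":
    print(len(P), axioms_hold(P))
```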

The question arises whether the equations of Theorem 1 are enough for de- 
termining the equational theory of algebras of protoconcepts, i.e., whether each 
equation valid in all algebras of protoconcepts can be entailed by the equations of 
Theorem 1. This is proved in the next section by investigating so-called “double 
Boolean algebras”. 

3 Double Boolean Algebras 

For investigating the equational theory of algebras of protoconcepts, we introduce
the notion of a double Boolean algebra which is an algebraic structure
$\underline{D} := (D, \sqcap, \sqcup, \neg, \lrcorner, \bot, \top)$ of type (2, 2, 1, 1, 0, 0) satisfying the equations 1a) to
11a), 1b) to 11b), and 12) of Theorem 1, where the further operations are defined
as in Section 2 by

\[ x \vee y := \neg(\neg x \sqcap \neg y) \quad\text{and}\quad x \wedge y := \lrcorner(\lrcorner x \sqcup \lrcorner y), \]

\[ \overline{\top} := \neg\bot \quad\text{and}\quad \underline{\bot} := \lrcorner\top. \]

In particular, each algebra of protoconcepts is a double Boolean algebra. Algebras
of semiconcepts additionally satisfy the following condition:

13) $x \sqcap x = x$ or $x \sqcup x = x$.

A double Boolean algebra satisfying the condition 13) is called pure, because
it only consists of the two subsets $D_\sqcap := \{x \in D \mid x \sqcap x = x\}$ and $D_\sqcup := \{x \in D \mid
x \sqcup x = x\}$ which both carry a Boolean structure. A detailed investigation of the
structure of algebras of semiconcepts and double Boolean algebras is presented in
[HLSW99]. Here we concentrate on questions concerning the equational theory
of algebras of protoconcepts.

For introducing an order on a double Boolean algebra, we imitate the order 
definition in Section 2: 

\[ x \sqsubseteq y :\Leftrightarrow x \sqcap y = x \sqcap x \text{ and } x \sqcup y = y \sqcup y. \]
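
That this algebraic definition imitates the set-theoretic order of Section 2 can be checked on a concrete algebra of protoconcepts. The following sketch assumes a small two-object context and hypothetical helper names; it compares both descriptions of the order on every pair of protoconcepts and also tests condition (1) of Lemma 1 on this instance.

```python
from itertools import combinations

# Assumed toy context with proper protoconcepts.
G = [1, 2]
M = ["a", "b"]
I = {(1, "a"), (1, "b"), (2, "a")}


def up(A):
    return frozenset(m for m in M if all((g, m) in I for g in A))


def down(B):
    return frozenset(g for g in G if all((g, m) in I for m in B))


def powerset(S):
    return [frozenset(c) for r in range(len(S) + 1) for c in combinations(S, r)]


P = [(A, B) for A in powerset(G) for B in powerset(M) if up(A) == up(down(B))]


def meet(x, y):
    A = x[0] & y[0]
    return (A, up(A))


def join(x, y):
    B = x[1] & y[1]
    return (down(B), B)


def leq_sets(x, y):
    """Order of Section 2: A1 ⊆ A2 and B1 ⊇ B2."""
    return x[0] <= y[0] and x[1] >= y[1]


def leq_ops(x, y):
    """Order of Section 3, expressed with meet and join only."""
    return meet(x, y) == meet(x, x) and join(x, y) == join(y, y)


if __name__ == "__main__":
    agree = all(leq_sets(x, y) == leq_ops(x, y) for x in P for y in P)
    print("both order definitions agree:", agree)
```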



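Lemma 1 In a double Boolean algebra the following conditions hold:

(1) $x \sqcap y \sqsubseteq x \sqsubseteq x \sqcup y$,

(2) the mapping $x \mapsto x \sqcap y$ preserves $\sqsubseteq$ and $\sqcap$,

(3) the mapping $x \mapsto x \sqcup y$ preserves $\sqsubseteq$ and $\sqcup$.

Proof: (1): 2a), 3a), and 1a) yield $(x \sqcap y) \sqcap x = y \sqcap (x \sqcap x) = y \sqcap (y \sqcap (x \sqcap x)) =
(x \sqcap y) \sqcap (x \sqcap y)$. By 2b) and 4b) it follows $(x \sqcap y) \sqcup x = x \sqcup x$. Hence $x \sqcap y \sqsubseteq x$. Dually,
we obtain $x \sqsubseteq x \sqcup y$. (2) and (3) follow straightforwardly. Let us only mention
that $x \sqsubseteq y$ implies $(x \sqcap z) \sqcup (y \sqcap z) = (x \sqcap x \sqcap z) \sqcup (y \sqcap z) = (x \sqcap y \sqcap z) \sqcup (y \sqcap z) =
(y \sqcap z) \sqcup (y \sqcap z)$ by 4b). □

For a double Boolean algebra $\underline{D} := (D, \sqcap, \sqcup, \neg, \lrcorner, \bot, \top)$ let $x_\sqcap := x \sqcap x$ and
$x_\sqcup := x \sqcup x$. We obtain that $D_\sqcap = \{x_\sqcap \mid x \in D\}$ and $D_\sqcup = \{x_\sqcup \mid x \in D\}$,
and that $\underline{D}_\sqcap := (D_\sqcap, \sqcap, \vee, \neg, \bot, \neg\bot)$ and $\underline{D}_\sqcup := (D_\sqcup, \wedge, \sqcup, \lrcorner, \lrcorner\top, \top)$ are Boolean
algebras.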

For proving that each double Boolean algebra can be quasi-embedded into
the algebra of protoconcepts of a suitable context, we define the notion of a filter
and an ideal of a double Boolean algebra $\underline{D}$: A subset $F$ of $D$ is called a filter if
$x \in F$ and $x \sqsubseteq y$ in $\underline{D}$ imply $y \in F$ and if $x \in F$ and $y \in F$ imply $x \sqcap y \in F$;
an ideal of $\underline{D}$ is dually defined. A subset $F_0$ is called a base of the filter $F$ if
$F = \{y \in D \mid x \sqsubseteq y \text{ for some } x \in F_0\}$; again, a base of an ideal is dually defined.

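Lemma 2 Let $F$ be a filter of a double Boolean algebra $\underline{D}$.

(1) $F \cap D_\sqcap$ and $F \cap D_\sqcup$ are filters of the Boolean algebras $\underline{D}_\sqcap$ and $\underline{D}_\sqcup$,
respectively.

(2) Each filter of the Boolean algebra $\underline{D}_\sqcap$ is a base of some filter of $\underline{D}$;
in particular, $F \cap D_\sqcap$ is a base of $F$.

Proof: Since the restrictions of $\sqcap$ and $\sqsubseteq$ to $D_\sqcap$ are the meet operation and the
order relation of the Boolean algebra $\underline{D}_\sqcap$, $F \cap D_\sqcap$ is obviously a filter of $\underline{D}_\sqcap$.
$F \cap D_\sqcup$ is a filter of the Boolean algebra $\underline{D}_\sqcup$ because $x \sqcap y \sqsubseteq x \wedge y$ for arbitrary
$x, y \in D_\sqcup$, namely $x \sqcap y$ is a lower bound of $x$ and $y$ by Lemma 1(1) and $x \wedge y$
is the greatest lower bound of $x$ and $y$ since the restriction of $\sqsubseteq$ to $D_\sqcup$ is the
order of the Boolean algebra $\underline{D}_\sqcup$. Now, let $E$ be a filter of $\underline{D}_\sqcap$. For $x_1 \sqsubseteq y_1$ and
$x_2 \sqsubseteq y_2$ with $x_1, x_2 \in E$, we obtain $x_1 \sqcap x_2 \sqsubseteq x_1 \sqcap y_2 \sqsubseteq y_1 \sqcap y_2$ by Lemma 1(2);
hence $\{y \in D \mid x \sqsubseteq y \text{ for some } x \in E\}$ is a filter in $\underline{D}$ with $E$ as a base. For
$y \in F$ we have that $y \sqcap y \in F \cap D_\sqcap$ and $y \sqcap y \sqsubseteq y$ by Lemma 1(1). Thus, $F \cap D_\sqcap$
is a base of $F$. □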

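Let $\mathfrak{F}_p(\underline{D})$ be the set of all filters $F$ of the double Boolean algebra $\underline{D}$ for
which $F \cap D_\sqcap$ is a prime filter of the Boolean algebra $\underline{D}_\sqcap$, and let $\mathfrak{I}_p(\underline{D})$ be
the set of all ideals $I$ of $\underline{D}$ for which $I \cap D_\sqcup$ is a prime ideal of the Boolean
algebra $\underline{D}_\sqcup$ (for the definitions and results concerning prime filters and prime
ideals see [DP92], p. 185ff.). Now, we define the standard context for a double
Boolean algebra $\underline{D}$ by

\[ \mathbb{K}(\underline{D}) := (\mathfrak{F}_p(\underline{D}), \mathfrak{I}_p(\underline{D}), \Delta) \]

where $F \Delta I :\Leftrightarrow F \cap I \neq \emptyset$ for $F \in \mathfrak{F}_p(\underline{D})$ and $I \in \mathfrak{I}_p(\underline{D})$. For $x \in D$, let

\[ \mathfrak{F}_x := \{F \in \mathfrak{F}_p(\underline{D}) \mid x \in F\} \quad\text{and let}\quad \mathfrak{I}_x := \{I \in \mathfrak{I}_p(\underline{D}) \mid x \in I\}. \]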

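Lemma 3 The derivations in $\mathbb{K}(\underline{D})$ yield:

(1) $\mathfrak{F}_x' = \mathfrak{I}_x = \mathfrak{I}_{x_\sqcup}$ for all $x \in D_\sqcap$,

(2) $\mathfrak{I}_y' = \mathfrak{F}_y = \mathfrak{F}_{y_\sqcap}$ for all $y \in D_\sqcup$,

(3) $\mathfrak{F}_z' = \mathfrak{I}_{z_\sqcap} = \mathfrak{I}_{z_{\sqcap\sqcup}}$ and $\mathfrak{I}_z' = \mathfrak{F}_{z_\sqcup} = \mathfrak{F}_{z_{\sqcup\sqcap}}$ for all $z \in D$.

Proof: (1): Let $x \in D_\sqcap$ and let $I \in \mathfrak{I}_x$. Then $x \in F \cap I$ for all $F \in \mathfrak{F}_x$; hence
$I \in \mathfrak{F}_x'$. Conversely, let $I \in \mathfrak{F}_x'$. Suppose $x \notin I$. Then $I \cap D_\sqcap$ is an ideal of $\underline{D}_\sqcap$,
by the dual of Lemma 2(1), and $x \in D_\sqcap \setminus I$. The ideal $I \cap D_\sqcap$ is contained in
a prime ideal of $\underline{D}_\sqcap$ the complement of which in $D_\sqcap$ is a prime filter $E$ of $\underline{D}_\sqcap$
containing $x$. By Lemma 2(2), $F := \{y \in D \mid w \sqsubseteq y \text{ for some } w \in E\}$ is a filter
of $\underline{D}$ with $F \cap D_\sqcap = E$; hence $F \in \mathfrak{F}_x$. But $F \cap I = \emptyset$ which contradicts $I \in \mathfrak{F}_x'$.
Thus, $x \in I$ and so $I \in \mathfrak{I}_x$. This proves $\mathfrak{F}_x' = \mathfrak{I}_x$. The corresponding equality in
(2) follows dually. (3): Let $z \in D$. Since $\mathfrak{F}_z = \mathfrak{F}_{z_\sqcap}$, we obtain $\mathfrak{F}_z' = \mathfrak{I}_{z_\sqcap} = \mathfrak{I}_{z_{\sqcap\sqcup}}$ by
(1). The second half of (3) follows dually. □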

The equivalence relation $\Box$ on a double Boolean algebra $\underline{D}$ defined by $x \Box y :\Leftrightarrow
x \sqsubseteq y$ and $y \sqsubseteq x$ ($\Leftrightarrow x_\sqcap = y_\sqcap$ and $x_\sqcup = y_\sqcup$) has as restriction the identity on
$D_p := D_\sqcap \cup D_\sqcup$, which is the largest pure subalgebra of $\underline{D}$. Especially, $\Box$ is a
congruence relation of $\underline{D}$ because, for each proper algebraic term $t(x_1, \ldots, x_n)$,
the relationships $a_1 \Box b_1, \ldots, a_n \Box b_n$ even imply $t(a_1, \ldots, a_n) = t(b_1, \ldots, b_n)$. For
formalizing the desired representation theorem we need the following notion:
a map $\alpha$ between quasi-ordered sets is called quasi-injective if it satisfies the
equivalence $x \sqsubseteq y \Leftrightarrow \alpha(x) \sqsubseteq \alpha(y)$.

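Theorem 2 Let $\underline{D}$ be a double Boolean algebra. Then $x \mapsto (\mathfrak{F}_x, \mathfrak{I}_x)$ describes
a quasi-injective homomorphism $\iota$ from $\underline{D}$ to $\underline{\mathfrak{P}}(\mathbb{K}(\underline{D}))$ having $\Box$ as kernel, and
each map $\beta : \iota(D) \to D$ satisfying $\iota\beta(y) = y$ is an injective homomorphism
from the image of $\iota$ into the double Boolean algebra $\underline{D}$.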

Proof: By Lemma 3, we obtain = 3'f and hence (dx,3x) is a 

protoconcept of K(F) for all x G D. For x^ % i/n in Fr there exists always 
an F G S"p(F) with Xn G F but yn ^ F ; hence ^x = S'xn ^ Syn = S'y and 
so (dx,3x) yf i^y,3y). Such inequality can be dually obtained for yu % a^u in 
Fy If xn E Un and yn E a;n, then a;n = yn and Xu = yu\ hence {^x,3x) = 
(Sy,3y). Therefore x i— describes a quasi-injective map t from F to 
fp(K(F)) having □ as kernel. It is even a homomorphism because we can show 
that ^xr\y — Ll ^ yi *^x\Jy — n 3y, ^^x = ^p{D) \ ^X, and lT_i^ = 3p{D) \ 3x- 
These equalities result from the following equivalences and their duals: F G 
^xHy ^ X n y G F x,y G F F G ^x <3^y and F G ^^x ^ -'X G F ^ 
-■(xn) GEg^xu^Fg^x^Fg^Fg dpiH) \ ^x- Now, let /? be a map from 
i(F) into F satisfying = y. Obviously, /3 is injective and even bijective 

on i{Dp). Since the operations of F always result in elements of Dp, we obtain 
f3L{x) n (3i{y) = I3i{f3i{x) n (3i{y)) = (3{Ll3i{x) n i(3i{y)) = P{i.{x) n r(j/)) and, 
analogously, the compatibility conditions for the other operations; hence ft is an 
injective homomorphism. □ 

By Theorem 2, if $t_1(x_1, \ldots, x_n) = t_2(x_1, \ldots, x_n)$ is a non-trivial equation
valid in all algebras of protoconcepts, then, for $a_1, \ldots, a_n \in D$, we get
$t_1(\iota(a_1), \ldots, \iota(a_n)) = t_2(\iota(a_1), \ldots, \iota(a_n))$ and even

\[ t_1(a_1, \ldots, a_n) = t_1(\beta\iota(a_1), \ldots, \beta\iota(a_n)) = t_2(\beta\iota(a_1), \ldots, \beta\iota(a_n)) = t_2(a_1, \ldots, a_n) \]

if in addition $\beta\iota(a_i) = a_i$ for $i = 1, \ldots, n$, the consequence of which is that
$a_i \Box a_j$ has to imply $a_i = a_j$. Since $a_i \Box a_j$ ($i < j$) generally yields the equality
$t_k(a_1, \ldots, a_i, \ldots, a_j, \ldots, a_n) = t_k(a_1, \ldots, a_i, \ldots, a_i, \ldots, a_n)$ for $k = 1, 2$, the
equation $t_1(x_1, \ldots, x_n) = t_2(x_1, \ldots, x_n)$ is valid in all double Boolean algebras
too. Thus, as a corollary of Theorem 1 and Theorem 2, we obtain the following
basic result of Boolean Concept Logic:

Corollary 1 The equational axioms of double Boolean algebras generate the
equational theory of the algebras of protoconcepts.

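For a finite double Boolean algebra $\underline{D}$, the elements of $\mathfrak{F}_p(\underline{D})$ are just the
principal filters $[a) := \{x \in D \mid a \sqsubseteq x\}$ where $a$ is an atom of the Boolean algebra
$\underline{D}_\sqcap$ and the elements of $\mathfrak{I}_p(\underline{D})$ are just the principal ideals $(c] := \{x \in D \mid x \sqsubseteq c\}$
where $c$ is a coatom of the Boolean algebra $\underline{D}_\sqcup$; furthermore, $[a) \Delta (c] \Leftrightarrow a \sqsubseteq c$.
Therefore, the formal context $(A(\underline{D}_\sqcap), C(\underline{D}_\sqcup), \sqsubseteq)$, for which $A(\underline{D}_\sqcap)$ is the set
of all atoms of $\underline{D}_\sqcap$ and $C(\underline{D}_\sqcup)$ is the set of all coatoms of $\underline{D}_\sqcup$, can be viewed
as a simplified version of the standard context of $\underline{D}$. Substituting this simplified
version in Theorem 2 yields the following corollary:

Corollary 2 Let $\underline{D}$ be a finite double Boolean algebra. Then $x \mapsto (\{a \in A(\underline{D}_\sqcap) \mid
a \sqsubseteq x\}, \{c \in C(\underline{D}_\sqcup) \mid x \sqsubseteq c\})$ describes a quasi-injective homomorphism $\iota$ from
$\underline{D}$ to $\underline{\mathfrak{P}}(A(\underline{D}_\sqcap), C(\underline{D}_\sqcup), \sqsubseteq)$ which maps $D_p$ bijectively onto $\mathfrak{H}(A(\underline{D}_\sqcap), C(\underline{D}_\sqcup), \sqsubseteq)$.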

4 Concept Algebras 

As already discussed in Section 1, within a concept lattice $\underline{\mathfrak{B}}(\mathbb{K})$ with $\mathbb{K} :=
(G, M, I)$, there is only a weak negation of formal concepts $(A, B)$ defined by
$\neg_\sqcup(A, B) := ((G \setminus A)'', (G \setminus A)')$, which equals $(\neg(A, B))_\sqcup$ in $\underline{\mathfrak{P}}(\mathbb{K})$; dually, we
also define a weak opposition by $\lrcorner_\sqcap(A, B) := ((M \setminus B)', (M \setminus B)'')$, which equals
$(\lrcorner(A, B))_\sqcap$ in $\underline{\mathfrak{P}}(\mathbb{K})$. It follows that $A \cup (G \setminus A)'' = G$, but $A \cap (G \setminus A)''$ might not
be empty and, dually, that $B \cup (M \setminus B)'' = M$, but $B \cap (M \setminus B)''$ might not be
empty. In case the incidences $gIm$ in $\mathbb{K}$ mean that the object $g$ may have
the attribute $m$, the elements of $A \cap (G \setminus A)''$ may be interpreted as those objects
which may or may not have all attributes of $B$ and the elements of $B \cap (M \setminus B)''$
may be interpreted as those attributes which all objects of $A$ may or may not
have. The concept lattice $\underline{\mathfrak{B}}(\mathbb{K})$ together with the weak negation $\neg_\sqcup$ and the
weak opposition $\lrcorner_\sqcap$ as unary operations and the constants $\bot$ and $\top$ as nullary
operations shall be viewed as the algebra $\underline{\mathfrak{A}}(\mathbb{K}) := (\mathfrak{B}(\mathbb{K}), \sqcap, \sqcup, \neg_\sqcup, \lrcorner_\sqcap, \bot, \top)$,
called the concept algebra of the formal context $\mathbb{K}$.
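
The weak negation and the weak opposition can be computed directly from their definitions. The sketch below assumes the four-elements context with hypothetical names and shows a formal concept whose extent overlaps the extent of its weak negation; the function names `weak_neg` and `weak_opp` are illustrative.

```python
from itertools import combinations

# Assumed toy context (four-elements table, hypothetical names).
G = ["fire", "air", "water", "earth"]
M = ["warm", "cold", "dry", "moist"]
I = {
    ("fire", "warm"), ("fire", "dry"),
    ("air", "warm"), ("air", "moist"),
    ("water", "cold"), ("water", "moist"),
    ("earth", "cold"), ("earth", "dry"),
}
GS, MS = frozenset(G), frozenset(M)


def up(A):
    return frozenset(m for m in M if all((g, m) in I for g in A))


def down(B):
    return frozenset(g for g in G if all((g, m) in I for m in B))


def powerset(S):
    return [frozenset(c) for r in range(len(S) + 1) for c in combinations(S, r)]


concepts = {(down(up(X)), up(X)) for X in powerset(G)}


def weak_neg(c):
    """Weak negation ((G \\ A)'', (G \\ A)') of the formal concept c = (A, B)."""
    A = GS - c[0]
    return (down(up(A)), up(A))


def weak_opp(c):
    """Weak opposition ((M \\ B)', (M \\ B)'') of the formal concept c = (A, B)."""
    B = MS - c[1]
    return (down(B), up(down(B)))


fire = (frozenset({"fire"}), frozenset({"warm", "dry"}))

if __name__ == "__main__":
    negated = weak_neg(fire)
    print("weak negation of fire:", sorted(negated[0]), sorted(negated[1]))
    print("common objects:", sorted(fire[0] & negated[0]))
    opposed = weak_opp(fire)
    print("weak opposition of fire:", sorted(opposed[0]), sorted(opposed[1]))
```

In this assumed context the weak negation of the concept with extent {fire} is the greatest concept, so the two extents share the object fire, whereas the weak opposition yields the concept with extent {water}.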

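To obtain a better structural understanding of concept algebras, we introduce
the notion of dicomplemented lattices by the following definition: A
dicomplemented lattice is defined as an algebra $\underline{L} := (L, \wedge, \vee, {}^{\triangle}, {}^{\triangledown}, 0, 1)$ of type
(2, 2, 1, 1, 0, 0) for which $(L, \wedge, \vee, 0, 1)$ is a bounded lattice and the following
conditions hold:

14a) $x^{\triangle\triangle} \le x$    14b) $x \le x^{\triangledown\triangledown}$

15a) $(x \vee y)^{\triangle} \le x^{\triangle} \wedge y^{\triangle}$    15b) $x^{\triangledown} \vee y^{\triangledown} \le (x \wedge y)^{\triangledown}$

16a) $(x \wedge y) \vee (x \wedge y^{\triangle}) = x$    16b) $x = (x \vee y) \wedge (x \vee y^{\triangledown})$

17a) $w \wedge x^{\triangle} \wedge y_1^{\triangle} \wedge \cdots \wedge y_n^{\triangle} \le x$ and $w^{\triangle} \wedge x^{\triangle} \wedge y_1^{\triangle} \wedge \cdots \wedge y_n^{\triangle} \le y_i$
imply $y_1^{\triangle} \wedge \cdots \wedge y_n^{\triangle} \le x$ for each $i \in \{1, \ldots, n\}$.

17b) $w \vee x^{\triangledown} \vee y_1^{\triangledown} \vee \cdots \vee y_n^{\triangledown} \ge x$ and $w^{\triangledown} \vee x^{\triangledown} \vee y_1^{\triangledown} \vee \cdots \vee y_n^{\triangledown} \ge y_i$
imply $y_1^{\triangledown} \vee \cdots \vee y_n^{\triangledown} \ge x$ for each $i \in \{1, \ldots, n\}$.

The conditions 14), 15) and 16) present (up to equivalence) equations, but 17)
yields only “quasi-equations” where a quasi-equation is defined as an implication
between equations. Thus, the dicomplemented lattices form a quasi-equational
class of algebraic structures. Clearly, $x \mapsto x^{\triangle\triangle}$ is an interior operator on the
underlying set $L$ by 14a), 15a) and, dually, $x \mapsto x^{\triangledown\triangledown}$ is a closure operator on $L$
by 14b), 15b); in particular, also by 16),

\[ x^{\triangledown\triangledown\triangledown} = x^{\triangledown} \le x^{\triangle} = x^{\triangle\triangle\triangle} \quad\text{and}\quad x^{\triangle\triangledown} \le x^{\triangle\triangle} \le x \le x^{\triangledown\triangledown} \le x^{\triangledown\triangle}. \]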

Furthermore, 16a) and 16b) yield x V x"^ = 1 and x A x'^ = 0, 0"^ = 1 and 
1^ = 0, respectively. The pair x* := (x^,x^) is called the dicomplement of 
X. Every bounded lattice can trivially be made to a dicomplemented lattice by 
defining (1,1), (0,0), and (0,1) as the dicomplements of 0, of 1, and of each 
element unequal 0 and 1, respectively. Less trivial examples are given by finite 
distributive lattices in which x^ and x^ are chosen as the pseudo-complement 
and the dual pseudo-complement of x. The algebraic structures of both classes 
can be verified as examples of dicomplemented lattices by the following theorem 
using as formal contexts (L,L,<) for bounded lattices L and {J{D),M{D),<) 
for finite distributive lattices D. 

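Theorem 3 For each formal context $\mathbb{K} := (G, M, I)$, the concept algebra $\underline{\mathfrak{A}}(\mathbb{K})
:= (\mathfrak{B}(\mathbb{K}), \sqcap, \sqcup, \neg_\sqcup, \lrcorner_\sqcap, \bot, \top)$ is a dicomplemented lattice.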

Proof: Clearly, $(\mathfrak{B}(\mathbb{K}), \sqcap, \sqcup, \bot, \top)$ is a bounded lattice. The conditions 14a),
14b), 15a), and 15b) follow directly from the definition of the unary operations.
For proving 16a) we consider $(A, B), (C, D) \in \mathfrak{B}(\mathbb{K})$; for the extent
of $((A, B) \sqcap (C, D)) \sqcup ((A, B) \sqcap \neg_\sqcup(C, D))$ it can be easily seen that
$((A \cap C) \cup (A \cap (G \setminus C)''))'' \subseteq A$ and $((A \cap C) \cup (A \cap (G \setminus C)''))'' \supseteq ((A \cap C) \cup
(A \cap (G \setminus C)))'' = A$. This yields the desired equality. For proving 17a) we
consider (A, B), (C, D), (E^, Fi), . . . (E„, F„) G ^(IK) with -iu(F;i, Gi) A • • • A 
~'u(£'n, Fn) ^ (G, D). Then, for each i G {1, . . . , n}, there exists a, gi G Ei with 
gi ^ C and gi ^ Ei because Ei C C would imply Ei C C which contradicts 
Gi" n • • • n En" C. Since gi G G”, we obtain gi G G” fl Gi" fl • • • fl G„" and 
hence, hy gi G A or gi G A , the alternative gi G AC\C fl Gi fl • • • fl G„ or 
gi G A n G n Gi n • • • n G„ . Because of gi ^ C and gi ^ Ei, we can now 
conclude that {A, B) A -■□(G, G) A -'u(Gi,Gi) A ••• A -'u(G„,G„) ^ (C,D) or 
-■u(7l, B) A ~'n{C,D) A -'u(Gi, Gi) A • • • A -•u(G„,G„) ^ (Gj, G^). Thus, 17a) is 
proved for concept algebras. 16b) and 17b) follow dually. □ 

Conversely, it can be shown that each dicomplemented lattice can be embedded
into the concept algebra of some suitable context. For the construction
of those contexts, we need the notion of a primary filter and a primary
ideal in a dicomplemented lattice $\underline{L} := (L, \wedge, \vee, {}^{\triangle}, {}^{\triangledown}, 0, 1)$: A subset $F$ of $L$ is
called a filter of $\underline{L}$ if $x \in F$ and $x \le y$ imply $y \in F$ and if $x \in F$ and $y \in F$ imply
$x \wedge y \in F$; an ideal is dually defined. A subset $F_0$ is called a base of the filter $F$ if
$F = \{y \in L \mid x \le y \text{ for some } x \in F_0\}$; again, a base of an ideal is dually defined.
A filter $F$ of $\underline{L}$ is called primary if there is an ideal $I$ and an element $x$ of $L$
with $x^{\triangle} \notin I$ such that $F$ is maximal under all filters which are disjoint from $I$
and do not contain $x$, but contain $w$ or $w^{\triangle}$ for all $w \in L$. The set of all primary
filters of $\underline{L}$ is denoted by $\mathfrak{F}_{pr}(\underline{L})$. Dually, primary ideals of $\underline{L}$ are defined; the set
of all those ideals is denoted by $\mathfrak{I}_{pr}(\underline{L})$. The desired context for representing $\underline{L}$
can now be defined by $\mathbb{K}(\underline{L}) := (\mathfrak{F}_{pr}(\underline{L}), \mathfrak{I}_{pr}(\underline{L}), \Delta)$ with $F \Delta I :\Leftrightarrow F \cap I \neq \emptyset$;
$\mathbb{K}(\underline{L})$ is called the standard context of the dicomplemented lattice $\underline{L}$. For $x \in L$,
let $\mathfrak{F}_x := \{F \in \mathfrak{F}_{pr}(\underline{L}) \mid x \in F\}$ and $\mathfrak{I}_x := \{I \in \mathfrak{I}_{pr}(\underline{L}) \mid x \in I\}$. We prepare the
desired representation theorem by three lemmas:

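Lemma 4 For elements $x$ and $y$ in $L$ the following conditions hold:

(1) $x \wedge y^{\triangle} \le y \Leftrightarrow x \le y$

(2) $x^{\triangle} \le y \Leftrightarrow y^{\triangle} \le x$

Proof: (1): Let $x \wedge y^{\triangle} \le y$. Together with 16a) it follows $x = (x \wedge y) \vee (x \wedge y^{\triangle}) =
x \wedge y$ and hence $x \le y$. The converse is obvious.

(2): 14a) and 15a) yield the following equivalences: $x^{\triangle} \le y \Leftrightarrow y^{\triangle} \le x^{\triangle\triangle} \Leftrightarrow
y^{\triangle} \le x$. □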

Lemma 5 For any ideal $I$ and any element $x \in L \setminus I$ with $x^{\triangle} \notin I$, there exists
a filter $F$ with $F \cap (\{x\} \cup I) = \emptyset$ and $w^{\triangle} \in F$ for all $w \in L \setminus F$.

Proof: Let / be an ideal and x £ L \ I with x^ ^ I. For S C L we generally 
define := | s G S'} and S— := | T is a finite subset of S|. Since 

{yW z)^ <y^ A by 15a), I—) is a base of a filter. This filter is disjoint 

from / and does not contain x because y'^ < z for some y, z £ I would imply 
1 = y V £ I which contradicts x^ ^ I, and y^ < x for some y £ I would 
imply x'^ < y by Lemma 4(2) which also contradicts x^ (f I. 

In general, for each J C L, we define the filter Fj := {y £ L \ y > 
z for some z G J— } and jj(x, I) to be the set of all those filters Fj which satisfy 
{x}Ul C J and ({x}U/)nFj = 0; furthermore, let 3(x, I) be the set consisting of 
all such J with Fj £ j?(x, I). By Lemma 4, for y,z £ I, x^ Ay^ < x would imply 
y^ < X and hence x^ < y £ I and x‘^ Ay'^ < z would imply x'^A(yVz)"^ < y'd z 
and hence x^ < y\/ z £ I which both contradict the assumption x^ (f I. There- 
fore, y G ({x| U /) n F(^[x}ui) = 0 hence {x| U / G Z{x, /). Thus, 3(x, I) is 
not empty. Let € be a non-empty chain in (3(x,J), C). It can be easily checked 
that F|j(j G 5^(x,/). Thus, by Zorn’s Lemma, there exits a maximal set J in 
(3(x,l),c). 

Let w £ L and let T be a finite subset of J. Because of A ^ x, we obtain 
by 17a) for each y £ Y the alternatives 

A x'’^ A A ^ X or Ax^ A /\ ^ y) and 

A x^ A /\Y^ ^ y or Ax^ A f\Y^ ^x) 

Suppose Ax^ A f\ Y^ < x. Then Ax^ A /\ Y"^ y for each y £Y . We 
also have Ax^ A (\ Y"^ -fi. x, because otherwise it would follow by 16a) that 
x^A/\ Y^ = (w^Ax"^A/\ Y‘^)'d{w^"^Ax"^A/\ Y'^) < x and hence A < x by 
Lemma 4 which contradicts $x \notin F_J$. If we replace $Y$ by any finite subset $Y$ of $J$,
then the inequality and the two negated inequalities have still to be true; hence
A !\ ^ I for all finite subsets Y of J. Furthermore, A /\ Y^ ^ 

for each finite subset Y of J, because otherwise, by Lemma 4(1), we would have 
/\Y^ < which contradicts A /\Y^ = w'^ A A /\Y^ < x. Thus, 
(J U {tC^}) n = 0 and x ^ Now, the maximality of J yields 

G J and hence G Fj. Analogously, A x^ A f\Y‘^ < x leads 

to w‘^ G Fj. Finally, we have to consider the case: A x‘^ A /\Y"^ ^ x and 

w^"^Ax^A/\Y^ ^ X for all finite subsets Y of J. Suppose Ax"^ Af\Y^ < yi 
and Ax"^ A /\ Y"^ < for some finite subset F of J and yi , 2/2 G Ffll; then 
we would get x'^A/\Y‘^ = {w^ Ax‘^ A f\Y‘^)y Ax'^ A /XY'^) < i/iVy 2 G I 
which contradicts Fj fl ({a;} U /) = 0. Thus, without loss of generality, we can 
assume that A x^ A /\ Y^ ^ / for all finite subsets Y of J. Consequently, 
^{w^uJ U /) = 0. The maximality of J yields again G J and hence 

g Since w £ L \ Fj implies G L \ Fj, we finally obtain that 

G Fj for all w G L \ Fj. □ 

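Lemma 6 The derivation in $\mathbb{K}(\underline{L})$ yields $\mathfrak{F}_x' = \mathfrak{I}_x$ and $(\mathfrak{F}_{pr}(\underline{L}) \setminus \mathfrak{F}_x)' = \mathfrak{I}_{x^{\triangle}}$ for
all $x \in L$.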

Proof: Let I £ 3x- Then cc G F fl / for all F G hence I £ ^x- Conversely, 
let I £ 3pr{L) \ 3x- Lemma 5 guarantees the existence of a filter F disjoint from 
I and not containing x^, for which G F for all ic G L\ F; consequently, 
X £ F. By Zorn’s Lemma, there exists a maximal filter F disjoint from / and not 
containing x‘^, but containing F. Obviously, F G jJa,; hence / ^ 'S'x- This proves 
'^'x = 3x- Now, let I G 3x-^- Since any primary filter F with x ^ F contains 
by definition, we obtain / G {Spr{Ld \Sx)'- Conversely, let I ^ 3x^- Then, by 
Lemma 5, there is a filter F with Fn({x}U/) = 0 and G F for all w £ L\F. 
Therefore, a maximal filter disjoint from / and not containing x, but containing 
F, which exists by Zorn’s Lemma, must be primary; hence / ^ {^pr(L) X^x)'- 
This proves the second claim. □ 

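Theorem 4 For a dicomplemented lattice $\underline{L}$, the map $x \mapsto (\mathfrak{F}_x, \mathfrak{I}_x)$ is an injective
homomorphism $\iota$ from $\underline{L}$ into $\underline{\mathfrak{A}}(\mathbb{K}(\underline{L}))$.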

Proof: By Lemma 6 and its dual, (Ua;,(5a;) is a formal concept for each x £ L. 
For X ^ y in F, by Lemma 5 and Zorn’s Lemma, there exist always an F G jJa, 
with y ^ F; hence ^x ^ dy and so {^x,'3x) {^y,3y). Therefore x H> (Sx,3x) 

describes an injective map from L into ^(K(F)). It is even a homomorphism 
because we can show that ^xAy = 3xvy = '3x<33y, ^x^ = {^pr{L)\^x)" , 

3x^ = (Ilpr(T) \Ux)", (l?o,3o) = (0,3*pr(T)), and (S'i,3i) = (l?pr(F),0)- These 
equalities result from the following equivalences, equalities, and their duals: F G 
^xAy X A y £ F 'v=r' x^y £ F F G ^x^ ~ {3x^) ~ {^pr (F) \^x') ? 

and = 0- C 

Corollary 3 The quasi-equational axioms of dicomplemented lattices generate
the quasi-equational theory of the dicomplemented lattices.

For a finite dicomplemented lattice $\underline{L}$, the elements of $\mathfrak{F}_{pr}(\underline{L})$ are just the
principal filters generated by a $\vee$-primary element and the elements of $\mathfrak{I}_{pr}(\underline{L})$
are just the principal ideals generated by a $\wedge$-primary element; here, an element
$a \neq 0$ of $L$ is called $\vee$-primary if $a$ covers at most two elements and if $x \not\ge a$
always implies $x^{\triangle} \ge a$, and an element $a \neq 1$ of $L$ is called $\wedge$-primary if $a$ is
covered by at most two elements and if $x \not\le a$ always implies $x^{\triangledown} \le a$. Every
$\vee$-irreducible element is $\vee$-primary by 16a), and every $\wedge$-irreducible element is
$\wedge$-primary by 16b). The set of all $\vee$-primary elements of $\underline{L}$ is denoted by $J_{pr}(\underline{L})$,
and the set of all $\wedge$-primary elements of $\underline{L}$ is denoted by $M_{pr}(\underline{L})$.

Corollary 4 Let $\underline{L}$ be a finite dicomplemented lattice. Then $x \mapsto (\{a \in J_{pr}(\underline{L}) \mid
a \le x\}, \{c \in M_{pr}(\underline{L}) \mid x \le c\})$ describes an isomorphism from $\underline{L}$ onto the concept
algebra $\underline{\mathfrak{A}}(J_{pr}(\underline{L}), M_{pr}(\underline{L}), \le)$.

5 Further Research 

This paper can only be considered as the beginning of developing Boolean Con- 
cept Logic. After identifying double Boolean algebras and dicomplemented lat- 
tices as basic algebraic structures for Boolean Concept Logic, the word problems 
for those two classes of algebras should be attacked which, in particular, sets 
the task of determining the free double Boolean algebras and the free dicom- 
plemented lattices. In this scope, the question arises whether there are useful 
normal forms for algebraic terms concerning double Boolean algebras and di- 
complemented lattices. Solutions to those problems and tasks will lead to fur- 
ther questions about the algorithmic treatment concerning the solutions. For the 
logical analysis of data contexts, a developed structure theory of algebras of pro- 
toconcepts and concept algebras would be desirable. In particular, constructions, 
decompositions, and interesting properties should be investigated for those al- 
gebras. A main direction of research will be determined by the needs concerning 
a successful development of Contextual Logic; in particular, the development 
of Contextual Judgment Logic based on concept graphs (see [Wi97a], [Wi98], 
[Pr98], [PW99], [MSW99]) would benefit from substantial results in Boolean 
Concept Logic. A measure for successful research will always be given by the 
usefulness of its results for conceptual knowledge representation and processing. 
Abstract. Triadic concept graphs have been introduced as a mathematization 
of conceptual graphs with subdivision. In this paper it is shown 
that triadic concept graphs of a triadic power context family always form 
a complete lattice with respect to the generalization order. For stating 
this result, a clarification of the notion of generalization is needed. It 
turns out that the generalization order may be defined differently, 
depending on the assumed background knowledge. 
1 Triadic Concept Graphs 

The aim of this paper is to show that the triadic concept graphs of a triadic 
power context family (see [Wi98]) always form a complete lattice with respect 
to the generalization order (cf. [PW99]). For this we have to clarify the notion of 
generalization for triadic concept graphs which, of course, presupposes a basic 
understanding of triadic power context families and their conceptual structures. 
Basic definitions and results of Dyadic Concept Analysis may be found in [GW96] 
and of Triadic Concept Analysis in [LW95], [Bi98], [WZ99]. 

Before recalling the definition of a triadic power context family and its triadic 
concept graphs, we start with an example from music concerned with the “logic” 
of the common musical notation. Figure 1 shows the notation of a simple melody 
modulating from g major via d major to b♭ major. The added boxes and circles 
yield a conceptual graph with a subdivision indicating the domains of the three 
keys. The notes without accidentals are references, the accidentals ♮, ♯, and ♭ 
combined with a key are concept names (♮ may be added in all concept boxes 
without an accidental), and the intervals ±2nd (minor second), ±2ND (major 
second), ±3rd (minor third), ±3RD (major third), and ±4th (fourth) are the 
Fig. 1. The notation of a modulating melody represented as a conceptual graph 



connecting binary relations. The relationships between notes, accidentals, keys, 
and intervals can be coded in formal contexts as follows (for simplifying the 
coding, we consider notes and intervals modulo octave): 

The notes d, e, f, g, a, b, c are represented by the numbers 0, 1, 2, 3, 4, 5, 6, the 
accidentals ♭, ♮, ♯ by -1, 0, 1, and the major keys c♭, g♭, d♭, a♭, e♭, b♭, f, c, g, d, a, 
e, b, f♯, c♯ by -7, ..., 0, ..., 7, respectively. We define a triadic context 
$\mathbb{K}_0 := (G_0, M_0, B, Y_0)$ by 

$G_0 := \{0,1,2,3,4,5,6\}$, $M_0 := \{-1,0,1\}$, $B := \{-7,\ldots,0,\ldots,7\}$, and 

$(i,j,k) \in Y_0 :\iff$ 
    $k \in B \setminus \{-7,7\}$, $j = 0$, and 
    $i \equiv k - 4n \pmod 7$ with $n \in \{-5,\ldots,0,1\}$, or 
    $k \in B \setminus \{0\}$, $j = k/|k|$, and 
    $i \equiv j \cdot (2 + 4n) \pmod 7$ with $n \in \{0,\ldots,|k|-1\}$. 



$(i,j,k) \in Y_0$ is read: the note $i$ has the accidental $j$ in the key $k$. A triadic 
diagram representing the concept trilattice of $\mathbb{K}_0$ is shown in Figure 2 where 
each black spot in the triangular net together with the linked spots to the right, 
the left, and above represent a triadic concept with its extent of notes, intent 
of accidentals, and modus of keys, respectively (triadic diagrams are explicitly 
described in [LW95], [Bi97]). 
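The coding of sharps and flats can be tried out with a small Python sketch. It implements only the second clause of the definition (the altered notes of a key) and assumes, as a simplification not taken from the text, that every note not altered in a key carries the natural sign. The function names are hypothetical.

```python
# Note codes follow the text: d=0, e=1, f=2, g=3, a=4, b=5, c=6.
NOTE_NAMES = ["d", "e", "f", "g", "a", "b", "c"]


def sign(key):
    """Return 1 for sharp keys, -1 for flat keys, 0 for c major."""
    return (key > 0) - (key < 0)


def altered_notes(key):
    """Notes carrying a sharp (key > 0) or a flat (key < 0) in major key `key`.

    Uses i = j * (2 + 4n) mod 7 with j = sign(key) and n = 0, ..., |key| - 1.
    """
    j = sign(key)
    return {(j * (2 + 4 * n)) % 7 for n in range(abs(key))}


def accidental(note, key):
    """Accidental code (-1, 0, 1) of `note` in `key`.

    Assumption of this sketch: a note that is not altered is natural (0).
    """
    return sign(key) if note in altered_notes(key) else 0


def incidence(keys=range(-7, 8)):
    """All triples (note, accidental, key) generated by the sketch."""
    return {(i, accidental(i, k), k) for k in keys for i in range(7)}


if __name__ == "__main__":
    for key in (1, 2, -1, -2):
        print(key, sorted(NOTE_NAMES[i] for i in altered_notes(key)))
```

For key 1 (g major) the sketch marks only f as sharp, and for key -1 (f major) only b as flat, which matches common notation.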

Now, we form a second triadic context for including musical intervals into 
our considerations. The (upward) intervals are represented by pairs of numbers 
as follows: 



1st    2nd    2ND    3rd    3RD    4th    4th+   5th-   5th    6th    6TH    7th     7TH 
(0,0)  (1,1)  (1,2)  (2,3)  (2,4)  (3,5)  (3,6)  (4,6)  (4,7)  (5,8)  (5,9)  (6,10)  (6,11) 



The set of these pairs of numbers shall be denoted by $M_2$. For $j \in G_0$, let 
$l_j := 2 \cdot j$ if $0 \leq j \leq 1$, $l_j := 2 \cdot j - 1$ if $2 \leq j \leq 5$, and $l_6 := 10$; clearly, $(j, l_j)$ 
represents just the interval from the note 0 to the note $j$ in the key 0 (i.e. c 
major). The second triadic context is now defined by 

$\mathbb{K}_2 := (G_0 \times G_0, M_2, B, Y_2)$ with 

$Y_2 := \{((i,j),\, (j - i \pmod 7),\; l_{j-4k \pmod 7} - l_{i-4k \pmod 7} \pmod{12}),\, k) \mid i,j \in G_0$ and $k \in B\}$. 
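The interval coding can be checked numerically. The following Python sketch computes, for two note codes and a key, the pair (diatonic steps, semitones) under one reading of the definition of the second context: the semitone offsets are taken in c major and shifted by 4 note steps per key. The function names are hypothetical and the reading itself is an assumption where the printed formula is hard to decipher.

```python
def semitones_from_note_0(j):
    """Semitones from note 0 (d) up to note j in key 0 (c major)."""
    if 0 <= j <= 1:
        return 2 * j
    if 2 <= j <= 5:
        return 2 * j - 1
    return 10  # j == 6


def interval(i, j, key):
    """Upward interval from note i to note j in major key `key`.

    Returns (diatonic steps mod 7, semitones mod 12). A shift by 4 note
    steps per key transposes the notes back into the c major pattern.
    """
    steps = (j - i) % 7
    upper = semitones_from_note_0((j - 4 * key) % 7)
    lower = semitones_from_note_0((i - 4 * key) % 7)
    return (steps, (upper - lower) % 12)


if __name__ == "__main__":
    print(interval(0, 2, 0))   # d to f in c major
    print(interval(0, 2, 1))   # d to f sharp in g major
    print(interval(2, 5, -1))  # f to b flat in f major
```

Under this reading, d to f is a minor third (2,3) in c major but a major third (2,4) in g major, where f is sharpened.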
Fig. 2. A triadic diagram of the concept trilattice of the musical notation context 



The formal contexts $\mathbb{K}_0$ and $\mathbb{K}_2$ form a triadic power context family which codes 
the basic relationships of the notation of tonal music. It allows us, for instance, to 
represent the conceptual graph of Figure 1 as a mathematical object, i.e. a triadic 
concept graph. To make this understandable in general, we discuss in this paper 
basic relationships in triadic power context families and between their triadic 
concept graphs. 
A triadic power context family is a family $\vec{\mathbb{K}} := (\mathbb{K}_0, \ldots, \mathbb{K}_n)$ 
($n \geq 2$) of triadic contexts $\mathbb{K}_k := (G_k, M_k, B, Y_k)$ ($k = 0, \ldots, n$) such that 
$G_k \subseteq (G_0)^k$. If $G_k = \emptyset$ for some $k$, we may omit $\mathbb{K}_k$ from the context listing. 
The concept trilattices $\mathfrak{T}(\mathbb{K}_k)$ yield the fundamental relationships in $\vec{\mathbb{K}}$ between 
objects, attributes, modalities, extents, intents, modi, and triadic concepts as, for 
instance, represented in Figure 2 for the above defined music context $\mathbb{K}_0$. The 
formal relationships on the contextual and conceptual level extend to higher 
complexity on the judgment level by the formation of the triadic concept graphs 
of $\vec{\mathbb{K}}$. 

For the definition of triadic concept graphs (without coreference links), first, 
an abstract concept graph with subdivision is specified as a structure 
$\mathfrak{G} := (V, E, \nu, C, \kappa, \sigma)$ for which 

1. $V$ and $E$ are sets and $\nu$ is a mapping of $E$ to $\bigcup_{k=1}^{n} V^k$ ($2 \leq n \in \mathbb{N}$) so that 
$(V, E, \nu)$ can be considered as a directed multi-hypergraph with vertices from 
$V$ and edges from $E$ (we define $|e| = k :\iff \nu(e) = (v_1, \ldots, v_k)$), 

2. $C$ is a set and $\kappa$ is a mapping of $V \cup E$ to $C$ such that $\kappa(e_1) = \kappa(e_2)$ 
always implies $|e_1| = |e_2|$ (the elements of $C$ may be understood as abstract 
concepts), 

3. $\sigma$ is a partial mapping of the vertex set $V$ to the power set of $V \times C$ such 
that, for the partial mappings $\sigma_1$ and $\sigma_2$ with 
$\sigma_1(w) := \{v \in V \mid (v,c) \in \sigma(w)$ for some $c \in C\}$ and 
$\sigma_2(w) := \{c \in C \mid (v,c) \in \sigma(w)$ for some $v \in V\}$ 
for $w \in dom(\sigma)$, we have $\sigma_1(V) \neq \emptyset \neq \sigma_2(V)$ and $v \notin (\sigma_1)^m(v)$ for all $v \in V$ 
and $m \in \mathbb{N}$ (for $c \in \sigma_2(w)$, the vertex set $A(c) := \{v \in V \mid (v,c) \in \sigma(w)\}$ 
is called the area of $c$ within the vertex $w$). Additionally, we assume that 
there is a universal vertex $w^{\top} \in V \cap C$ with 
$\sigma(w^{\top}) = \{(v, w^{\top}) \mid v \in V \setminus \sigma_1(w)$ for all $w \in dom(\sigma) \setminus \{w^{\top}\}\}$. 

The abstract concept graph with subdivision belonging to the conceptual graph 
in Figure 1 has as vertices the small rectangles and the large boxes (including 
the area of the whole diagram), as edges the links joining two rectangles 
via a circle, and as abstract concepts (♮, g major), (♮, {g major, d major}), 
(♮, d major), (♮, {d major, b♭ major}), (♮, b♭ major), (♯, d major), (♭, b♭ major), 
g major, d major, b♭ major, and key (as the universal “vertex-concept”); 
furthermore, $\sigma_1$ maps a large box to the set of all rectangles it contains and $\sigma_2$ 
maps a large box to the abstract concept named in it. Hence, the rectangles in a 
large box form the area of the corresponding abstract concept within that box. 

In this example of an abstract concept graph we already have connections 
to a triadic power context family. What that means, mathematically, shall be 
specified next: Let $\vec{\mathbb{K}} := (\mathbb{K}_0, \ldots, \mathbb{K}_n)$ be a triadic power context family with 
$\mathbb{K}_k := (G_k, M_k, B, Y_k)$ ($0 \leq k \leq n$) and let $C_{\vec{\mathbb{K}}} := \bigcup_{k=0}^{n} \mathfrak{T}(\mathbb{K}_k)$. An abstract 
concept graph $\mathfrak{G} := (V, E, \nu, C, \kappa, \sigma)$ with subdivision is called a triadic concept 
graph over the triadic power context family $\vec{\mathbb{K}}$ if 

4. $C = C_{\vec{\mathbb{K}}}$, 

5. $\kappa(V) \subseteq \mathfrak{T}(\mathbb{K}_0)$, 

6. $\kappa(e) \in \mathfrak{T}(\mathbb{K}_k)$ for all $e \in E$ with $|e| = k$, 







Bernd Groh and Rudolf Wille 



7. $\sigma_2(w) \subseteq G_0$ for all $w \in dom(\sigma)$, 

8. $c_1 \in Ext(c_2), c_2 \in Ext(c_3), \ldots, c_{m-1} \in Ext(c_m)$ for $c_1, c_2, \ldots, c_m \in \sigma_2(V)$ 
imply $c_1 \neq c_m$. 

A realization of such a triadic concept graph $\mathfrak{G}$ over $\vec{\mathbb{K}}$ is defined to be a map $\rho$ of $V$ 
to the power set of $G_0$ which satisfies, for all $v \in V$, $w \in dom(\sigma)$, $c \in C_{\vec{\mathbb{K}}}$, and $e \in E$ 
with $\nu(e) = (v_1, \ldots, v_k)$, allowing the abbreviations $\rho(e) := \rho(v_1) \times \cdots \times \rho(v_k)$ 
and $(e,c) \in \sigma(w) :\iff (v_1,c), \ldots, (v_k,c) \in \sigma(w)$, the conditions 

9. $\emptyset \neq \rho(v) \subseteq (Int(\kappa(v)) \times Mod(c))^{(1)}$ if $(v,c) \in \sigma(w)$, 

10. $\rho(e) \subseteq (Int(\kappa(e)) \times Mod(c))^{(1)}$ if $(e,c) \in \sigma(w)$, 

11. $\sigma_2(w) \subseteq \rho(w)$. 

The pair $\underline{\mathfrak{G}} := (\mathfrak{G}, \rho)$ is called a realized triadic concept graph of the triadic power 
context family $\vec{\mathbb{K}}$ or, shortly, a triadic concept graph of $\vec{\mathbb{K}}$ (cf. [Wi98]). 

Now, let us prove that Figure 1 indeed represents a triadic concept graph of 
the triadic power context family $(\mathbb{K}_0, \mathbb{K}_2)$ of musical notation. Above we already 
discussed how the diagram in Figure 1 can be understood as a representation 
of an abstract concept graph which, by definition, satisfies the conditions 1, 2, 
and 3. It also satisfies 4, 5, and 6 under the assumption that $\kappa$ maps each small 
rectangle to the triadic concept of $\mathbb{K}_0$ generated by the accidental of the rectangle 
together with the keys of the large boxes containing the rectangle, and each large 
box to the triadic concept of $\mathbb{K}_0$ uniquely determined by the key named in the 
box (cf. Figure 2), and each link to the triadic concept of $\mathbb{K}_2$ uniquely determined 
by an attribute, namely the number code of the interval named in the circle of the 
link. For confirming condition 7, we admit that each key is not only understood 
as a modality, but also as the triadic concept of $\mathbb{K}_0$ uniquely determined by the 
key and, following the principle of hypostatic abstraction, as a (general) object 
too; hence we assume that the keys understood as triadic concepts belong to the 
(extended) object set $G_0$ of $\mathbb{K}_0$, which yields the condition 7 and also 8. The 
realization map $\rho$ has to assign to each small rectangle the number code of the 
note it contains, and to each large box the triadic concept of the key named in 
that box. Checking the conditions 9 to 11 is now straightforward. 

2 The Generalization Order 

In this section we discuss how an appropriate ordering may be defined on 
the set $\Gamma(\vec{\mathbb{K}})$ of all triadic concept graphs of a triadic power context family 
$\vec{\mathbb{K}} := (\mathbb{K}_0, \ldots, \mathbb{K}_n)$ consisting of the triadic contexts $\mathbb{K}_k := (G_k, M_k, B, Y_k)$ 
($0 \leq k \leq n$). For concepts the dominating ordering is the subconcept-superconcept 
relation, also called the generalization order. Generalizing a concept means to 
select some attributes of the concept and then to determine a superconcept by 
them (cf. [Ac71]). This method cannot be directly applied to triadic concepts, but 
if we proceed in selecting attributes while keeping the modalities of a triadic 
concept fixed, then we may obtain more general triadic concepts. This idea gives 
rise to a generalization order on the set $\Gamma(\vec{\mathbb{K}})$ of triadic concept graphs which 
we capture by the following definitions: 
For a triadic concept graph $\mathfrak{G} := (V, E, \nu, C_{\vec{\mathbb{K}}}, \kappa, \sigma, \rho)$ of the triadic power 
context family $\vec{\mathbb{K}}$, we define the conceptual content $C(\mathfrak{G}) := (C_0(\mathfrak{G}), \ldots, C_n(\mathfrak{G}))$ 
by 

$C_0(\mathfrak{G}) := \{(g, \kappa(v), c) \mid (v,c) \in \sigma(V)$ and $g \in \rho(v)\}$, 

$C_k(\mathfrak{G}) := \{(g, \kappa(e), c) \mid (e,c) \in \sigma(w)$ with $w \in dom(\sigma)$ and $g \in \rho(e)\}$. 

Obviously, the realization $\rho$ of $\mathfrak{G}$ and the conceptual content $C(\mathfrak{G})$ determine 
each other because $g \in (Int(\kappa(v)) \times Mod(c))^{(1)}$ is equivalent to $(g, \kappa(v), c) \in C_0(\mathfrak{G})$ 
and, for $|e| = k$, $g \in (Int(\kappa(e)) \times Mod(c))^{(1)}$ is equivalent to $(g, \kappa(e), c) \in C_k(\mathfrak{G})$. 

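Now, we introduce the quasi-order $\lesssim$ of generalization on $\Gamma(\vec{\mathbb{K}})$ as follows: 
For two triadic concept graphs $\mathfrak{G}_1 := (V_1, E_1, \nu_1, C_{\vec{\mathbb{K}}}, \kappa_1, \sigma_1, \rho_1)$ and 
$\mathfrak{G}_2 := (V_2, E_2, \nu_2, C_{\vec{\mathbb{K}}}, \kappa_2, \sigma_2, \rho_2)$, we define $\mathfrak{G}_1 \lesssim \mathfrak{G}_2$ if, for all 
$(g, \kappa_2(v), c) \in C_0(\mathfrak{G}_2)$, there exist $(g, \kappa_1(v_s), c) \in C_0(\mathfrak{G}_1)$ ($s \in S$) with 

$\nabla_{23}(\{\kappa_1(v_s) \mid s \in S\}, c) \lesssim_1 \nabla_{23}(\kappa_2(v), c)$, 

and if, for all $((g_1, \ldots, g_k), \kappa_2(e), c) \in C_k(\mathfrak{G}_2)$, there exist 
$((g_1, \ldots, g_k), \kappa_1(e_t), c) \in C_k(\mathfrak{G}_1)$ ($t \in T$) with 

$\nabla_{23}(\{\kappa_1(e_t) \mid t \in T\}, c) \lesssim_1 \nabla_{23}(\kappa_2(e), c)$; 

if $\mathfrak{G}_1 \lesssim \mathfrak{G}_2$ we say that $\mathfrak{G}_2$ is more general than $\mathfrak{G}_1$. The triadic concept graphs 
$\mathfrak{G}_1$ and $\mathfrak{G}_2$ of $\vec{\mathbb{K}}$ are said to be conceptually equivalent (in symbols: $\mathfrak{G}_1 \sim \mathfrak{G}_2$) if 
$\mathfrak{G}_1 \lesssim \mathfrak{G}_2$ and $\mathfrak{G}_2 \lesssim \mathfrak{G}_1$. The class of all triadic concept graphs of $\vec{\mathbb{K}}$ which are 
conceptually equivalent to a given triadic concept graph $\mathfrak{G}$ is denoted by $\widetilde{\mathfrak{G}}$. The 
set of all equivalence classes $\widetilde{\mathfrak{G}}$ of triadic concept graphs $\mathfrak{G}$ of $\vec{\mathbb{K}}$ together with 
the order $\leq$ induced by the quasi-order $\lesssim$ is an ordered set denoted by $\widetilde{\Gamma}(\vec{\mathbb{K}})$. 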

The definition of the generalization order for the triadic concept graphs of 
a triadic power context family $\vec{\mathbb{K}}$ uses as “background knowledge” only the 
trilattice structure of the $\mathfrak{T}(\mathbb{K}_k)$ but not the contextual relationships of $\vec{\mathbb{K}}$. This is the 
reason that the operation $\nabla_{23}$ is only applied to pairs of the form 
$(\{\kappa_1(v_s) \mid s \in S\}, c)$ with one element in the second place, because the necessary knowledge 
of $c \in G_0 \cap \mathfrak{T}(\mathbb{K}_0)$ may only be deduced from the image of $\sigma_2$, but not from $\vec{\mathbb{K}}$. 
The restriction of the “background knowledge” to the lattice structure of the 
concepts was also basic for the treatment of the dyadic case in [PW99]. Notice 
that the triadic approach also covers the dyadic case: If $B := \{b\}$ then the 
triadic concepts $(A_1, A_2, A_3)$ of a triadic context $(G, M, B, Y)$ are in one-to-one 
correspondence to the dyadic concepts $(A_1, A_2)$ of the dyadic context $(G, M, I)$ 
with $gIm :\iff (g, m, b) \in Y$. By this correspondence, 
$\nabla_{23}(\{\kappa_1(v_s) \mid s \in S\}, c) \lesssim_1 \nabla_{23}(\kappa_2(v), c)$ translates to 
$\bigwedge\{\kappa_1(v_s) \mid s \in S\} \leq \kappa_2(v)$ (an adapted version of 
this inequality should replace the misprinted inequality in [PW99] on page 407, 
line 3 from below). 

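There might be good reasons to assume as “background knowledge” the full 
contextual and conceptual structure of the triadic power context family $\vec{\mathbb{K}}$. This 
may lead to a modified generalization order in the following way: For a triadic 
concept graph $\mathfrak{G} := (V, E, \nu, C_{\vec{\mathbb{K}}}, \kappa, \sigma, \rho)$ of the triadic power context family $\vec{\mathbb{K}}$, 
we define the contextual content $Y(\mathfrak{G}) := (Y_0(\mathfrak{G}), \ldots, Y_n(\mathfrak{G}))$ by 

$Y_0(\mathfrak{G}) := \bigcup \{\rho(v) \times Int(\kappa(v)) \times Mod(c) \mid (v,c) \in \sigma(w)$ with $w \in dom(\sigma)\}$, 
$Y_k(\mathfrak{G}) := \bigcup \{\rho(e) \times Int(\kappa(e)) \times Mod(c) \mid (e,c) \in \sigma(w)$ with $w \in dom(\sigma)\}$. 

Based on the notion of a contextual content, we propose a quasi-order $\sqsubseteq$ on $\Gamma(\vec{\mathbb{K}})$ 
by defining 

$\mathfrak{G}_1 \sqsubseteq \mathfrak{G}_2 :\iff Y_k(\mathfrak{G}_2) \subseteq Y_k(\mathfrak{G}_1)$ for $k = 0, \ldots, n$. 

$\mathfrak{G}_1$ and $\mathfrak{G}_2$ of $\vec{\mathbb{K}}$ are said to be contextually equivalent (in symbols: $\mathfrak{G}_1 \approx \mathfrak{G}_2$) if 
$\mathfrak{G}_1 \sqsubseteq \mathfrak{G}_2$ and $\mathfrak{G}_2 \sqsubseteq \mathfrak{G}_1$. The class of all triadic concept graphs of $\vec{\mathbb{K}}$ which are 
contextually equivalent to a given triadic concept graph $\mathfrak{G}$ is denoted by $\widehat{\mathfrak{G}}$. The 
set of all equivalence classes $\widehat{\mathfrak{G}}$ of triadic concept graphs $\mathfrak{G}$ of $\vec{\mathbb{K}}$ together with 
the order induced by the quasi-order $\sqsubseteq$ is an ordered set denoted by $\widehat{\Gamma}(\vec{\mathbb{K}})$. 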

3 Lattices of Triadic Concept Graphs 

For studying relationships between the triadic concept graphs of a triadic power 
context family $\vec{\mathbb{K}}$, it is valuable to understand the structure of the ordered set 
$\widetilde{\Gamma}(\vec{\mathbb{K}})$. Surprisingly, $\widetilde{\Gamma}(\vec{\mathbb{K}})$ is always a complete lattice which is isomorphic to a 
complete subdirect product of specific sublattices of dyadic concept lattices 
derived from the contexts $\mathbb{K}_k$ ($k = 0, \ldots, n$) and the $c \in G_0 \cap \mathfrak{T}(\mathbb{K}_0)$, each extended 
by a new top element, as Proposition 1 states (in general, we assume that 
$G_0 \cap \mathfrak{T}(\mathbb{K}_0) \neq \emptyset$). 

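Proposition 1 Let $\vec{\mathbb{K}} := (\mathbb{K}_0, \ldots, \mathbb{K}_n)$ be a triadic power context family with 
$\mathbb{K}_k := (G_k, M_k, B, Y_k)$ for $k = 0, \ldots, n$; furthermore, for each $g \in G_k$ and 
$c \in G_0 \cap \mathfrak{T}(\mathbb{K}_0)$, let $L_k^{g,c}$ be the ordered set with 

$L_k^{g,c} := \{b_{13}(A, Mod(c)) \mid g \in A \subseteq G_k\}$ 

together with a new top-element $\top_k^{g,c}$. Then $\widetilde{\Gamma}(\vec{\mathbb{K}})$ is isomorphic to the complete 
subdirect product of the complete lattices $L_k^{g,c}$ ($k \in \{0, \ldots, n\}$, $g \in G_k$, 
$c \in G_0 \cap \mathfrak{T}(\mathbb{K}_0)$) consisting of all elements $a := (a_k^{g,c})$ of the direct product 
satisfying the following condition: 

(*) if $a_k^{g,c} \neq \top_k^{g,c}$ and $g = (g_1, \ldots, g_k)$, then $a_0^{g_i,c} \neq \top_0^{g_i,c}$ for $i = 1, \ldots, k$. 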

Proof: The map assigning the triadic concepts {Ai,A 2 ,A^) := hiz{A,Mod{c)) 
of ~i) to the dyadic concepts {Ai,A 2 ) of is an isomorhism 

between ordered sets; hence (Jj£’'^\~i) is a complete lattice and therefore 
too. 
Now, let $\mathfrak{G} := (V, E, \nu, C_{\vec{\mathbb{K}}}, \kappa, \sigma, \rho)$ be a triadic concept graph of $\vec{\mathbb{K}}$. For $g \in G_k$ 
and c G Go n T(Ko), let be the element if there is no m G F U if 

with {u, c) G < 7 {V) and g G p{u), and let otherwise 



:= V23({k(u) \ uGVU E,{u,c) G a(V),gG p{u)},c) G 

In the second case, we construct the “atomic” triadic concept graph consisting 
only of and u with := bi3(Gfe, Mod(c)), k{u) := a{w^) := 

{(ft, c)}, := {c},p(u) := {g} and, ifg= (gi, . . . , ^fc), also of hi, . . . , Wfc with 

iy{u) := (vi,...,Vk), K(vt) := bi3(Gfe, Mod(c)), cr(t(;^) := {(hi, c), . . . , (hfc, c)}. 
and p(hj) := {gi} for i = 1 , . . . , k. The disjoint union of all such atomic triadic 
concept graphs, having an object g as reference, is conceptually equivalent to 
the given concept graph ©. 

Furthermore, for each (g, k{u), c) G Gfe(©), we can construct an atomic triadic 
concept graph 2 l(a^®’'^(©)) as follows: If fc = 0 and g = g G Go, we define 



{v G V) with := bi3(Gfc, Mod(c)), := a[,®’'^(©), := 

{(^g)}, := |c}, and Po^’'\v) := |g}; if A: > I and g = (gi,...,gk) G Gu, 

we define 



:= {{w^ ,vi, . . . 



ds.d (g,t) 



p ' r ’) 



(ui, ...,VkGV and e G E) with := (wi, . . . ,Wfc), K^k'''\^) 

k{w^) bi3(Gfc,Mod(c)), := bi3(Gfc, Mod(c)), := |(?^i,c), 

...,(?;fe,c)}, p{w^) := |c}, and pi^’'\vi) = {g,} for f = 1, . . . , fc. 

Now, the desired isomorphism can be defined as follows (cf. [PW99]): 



77*: T(K) — n I fc G ( 0 , . . . , n} and g G Gk) with 

77^(0) := ( 4 ®’‘' 4 ©) I A: G ( 0 , . . . ,7i} and g G Gk) 



is a mapping, the image of which consists of all elements of the direct product 
satisfying condition (*). It can be easily seen that this image is a complete 
subdirect product. For concept graphs ©1 and ©2, we have the equivalences 

©1 < ©2 VA: G (0, . . . ,77} V 5 G Gfc: 4®’‘'4©i) < 

77’^(©l) < T7^(©2). 



Hence 77* is an injective homomorphism from E(K) into I 0 < A: < ti, 

g G Gfc, and c G GorvT(Ko)). Thus, the assertion of the proposition is proved. □ 



Proposition 1 yields, for the triadic concept graphs of a triadic power context 
family, a system of representatives for their equivalence classes which makes 
the generalization order transparent. These representatives are described by the 
elements of all the direct products 

$\prod_{(k,g,c) \in U} L_k^{g,c}$ 

for which $U$ is a finite subset of $\bigcup_{k=0}^{n} \{k\} \times G_k \times (G_0 \cap \mathfrak{T}(\mathbb{K}_0))$ satisfying the 
implication 

$(k,g,c) \in U$ and $g = (g_1, \ldots, g_k) \Rightarrow (0, g_1, c), \ldots, (0, g_k, c) \in U$. 

An element $a := (a_k^{g,c} \mid (k,g,c) \in U)$ of the product represents the triadic 
concept graph which is the disjoint union of the atomic triadic concept graphs 
$\mathfrak{A}(a_k^{g,c})$ with $(k,g,c) \in U$. In the full direct product 

$\prod (L_k^{g,c} \mid 0 \leq k \leq n$, $g \in G_k$, and $c \in G_0 \cap \mathfrak{T}(\mathbb{K}_0))$, 

this triadic concept graph is represented by the element which coincides with 
$a$ on $U$ and has outside $U$ the corresponding top-elements as components (the 
top-element $\top_k^{g,c}$ as a component indicates that $g$ is not a reference in the 
represented graph). 

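Besides the already analyzed complete lattice $\widetilde{\Gamma}(\vec{\mathbb{K}})$, the ordered set $\widehat{\Gamma}(\vec{\mathbb{K}})$ 
of the classes of contextually equivalent triadic concept graphs also deserves an 
investigation to gain more insights into the relationships between triadic concept 
graphs of a triadic power context family $\vec{\mathbb{K}}$. We obtain that this ordered set 
is also a complete lattice: 

Proposition 2 Let $\vec{\mathbb{K}} := (\mathbb{K}_0, \ldots, \mathbb{K}_n)$ be a triadic power context family with 
$\mathbb{K}_k := (G_k, M_k, B, Y_k)$ for $k = 0, \ldots, n$. Then $\widehat{\Gamma}(\vec{\mathbb{K}})$ is isomorphic to the complete 
lattice 

$\prod_{k=0}^{n} \underline{\mathfrak{B}}(Y_k,\ G_k \times \mathfrak{T}(\mathbb{K}_k) \times (G_0 \cap \mathfrak{T}(\mathbb{K}_0)),\ I_k)$ 

with $(g,m,b) \, I_k \, (h, \mathfrak{b}, c) :\iff g \neq h$ or $g \notin (Int(\mathfrak{b}) \times Mod(c))^{(1)}$ or $m \notin Int(\mathfrak{b})$ 
or $b \notin Mod(c)$. 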

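Proof: For a triadic concept graph $\mathfrak{G} := (V, E, \nu, C_{\vec{\mathbb{K}}}, \kappa, \sigma, \rho)$ of $\vec{\mathbb{K}}$ we define 
$\iota_k(\mathfrak{G}) := (Y_k \setminus Y_k(\mathfrak{G}), (Y_k \setminus Y_k(\mathfrak{G}))^{I_k})$. Since 
$Y_k(\mathfrak{G}) = \bigcup \{Y_k \setminus (h, \mathfrak{b}, c)^{I_k} \mid (u,c) \in \sigma(w)$ with $w \in dom(\sigma)$, $\mathfrak{b} = \kappa(u)$, and $h \in G_k \cap \rho(u)\}$, 
we have that $Y_k \setminus Y_k(\mathfrak{G})$ is 
an extent and so $\iota_k(\mathfrak{G})$ a formal concept of the formal context 
$(Y_k, G_k \times \mathfrak{T}(\mathbb{K}_k) \times (G_0 \cap \mathfrak{T}(\mathbb{K}_0)), I_k)$ for $k = 0, \ldots, n$. Now, it is straightforward to prove that $\iota$ is 
even the desired isomorphism. □ 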

The conceptual content and the contextual content are interesting notions 
for understanding which information might be given by triadic concept graphs. 
But, Figure 1 shows that there are even further interesting ideas of content, 
because the melodic character of the presented triadic concept graph has been 
grasped neither by its conceptual content nor by its contextual content. 
Abstract. The frequently used strategy of forming hypotheses, based on 
observations, to generate predictions can be formalized in several ways. 
We study an elaborated approach that has some conceptual flavour and 
translate the basic ideas in the language of Formal Concept Analysis, 
blazing the trail for applications to other conceptual structures as well. 
We investigate the relation to pseudo intents, discuss algorithmic ques- 
tions, and give an example. 



Motivation 

Imagine that on your way to work there is a junction where you frequently run 
into a traffic jam. You would like to know in advance whether you are likely to 
get into the hold-up or not, because then you could take a detour to avoid a delay. 
But you know of no single obvious cause for the congestion. You know that 
certain constellations are very likely to cause such a problem, and you also know 
of situations where it is rather safe. Probably you will develop your personal 
hypotheses, based on your positive and negative experiences, to predict traffic 
jams at that junction. 

A similarly structured problem, concerning pharmacological applications, is 
known as the Structure-Activity Relationship (SAR) Problem [14], where struc- 
tural properties of chemical compounds (such as particular subgraphs of their 
molecular graphs) are used as predictors for certain biological activities (like 
mutagenicity, sedativity, or such). 

The logical framework of such problems has been extensively studied. One 
of the broader approaches is called the JSM-method (after John Stuart Mill, an 
English philosopher of the 19th century, who was one of the first to systematically 
consider schemes of inductive reasoning). It was proposed by V. Finn [3] in 1983 
and has been developed ever since by his group at the Moscow VINITI Institute. 
From the standpoint of Artificial Intelligence, that method can be considered as a 
formal system of inductive plausible reasoning based on examples and knowledge 
about the domain (see Finn [4]) or as a method of Machine Learning that employs 
examples and counterexamples of a goal attribute (compare [10]). 

The original formalization of the JSM-method used by Finn was the language 
of first-order predicate calculus with two sorts of variables, six truth-value types 
and “quantification over tuples with variable length” [1]. However, the underlying 
combinatorial structure of the JSM-method suggests analogies to another 
method of knowledge processing, that of Formal Concept Analysis (FCA) [6]. 
Some attempts to connect these approaches were already made in [11], [12]. In 
this paper we propose a partial translation from the language of the JSM-theory 
into that of FCA.¹ 

1 Hypotheses 

We consider a finite set $M$ of “structural attributes”, a set $G$ of objects (or 
observations) and a relation $I \subseteq G \times M$, such that $(g, m) \in I$ if and only if object 
$g$ has the attribute $m$. Such a triple $\mathbb{K} = (G, M, I)$ is called a formal context. 
It is the basic data type of Formal Concept Analysis. Using the derivation 
operators, defined for $A \subseteq G$, $B \subseteq M$ by 

$A' := \{m \in M \mid gIm$ for all $g \in A\}$, 

$B' := \{g \in G \mid gIm$ for all $m \in B\}$, 

we can define a formal concept (of the context $\mathbb{K}$) to be a pair $(A, B)$ satisfying 
$A \subseteq G$, $B \subseteq M$, $A' = B$, and $B' = A$. $A$ is called the extent and $B$ is called 
the intent of the concept $(A, B)$. These concepts, ordered by 

$(A_1, B_1) \geq (A_2, B_2) :\iff A_1 \supseteq A_2$, 

form a complete lattice, called the concept lattice of $\mathbb{K} := (G, M, I)$. Double 
applications of the derivation operators, i.e., $(B')'$ or $(A')'$, are abbreviated as $B''$ 
or $A''$, respectively. It can easily be shown that $''$ is a closure operator, i.e., it is 
monotone, idempotent, and extensive. Further on, this operator will be referred 
to as closure and sets $A \subseteq G$, $B \subseteq M$ such that $A'' = A$, $B'' = B$ as closed. 

In what follows we shall need the notion of implication between attribute sets 
[6]. An implication $A \rightarrow B$ between a pair of subsets of the attribute set $M$ 
holds for a given formal context $\mathbb{K} := (G, M, I)$ if every object that has all the 
attributes from $A$ also has all attributes from $B$. This is equivalent to $A' \subseteq B'$. 
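The derivation operators and the implication test translate directly into code. The following Python sketch stores a formal context as a dictionary mapping each object to its set of attributes; the toy context and all function names are assumptions made for illustration only.

```python
def intent(objects, context, attributes):
    """A' : the attributes shared by all objects in `objects`."""
    shared = set(attributes)
    for g in objects:
        shared &= context[g]
    return frozenset(shared)


def extent(attrs, context):
    """B' : the objects that have every attribute in `attrs`."""
    return frozenset(g for g, row in context.items() if attrs <= row)


def closure(attrs, context, attributes):
    """B'' : the closure of an attribute set."""
    return intent(extent(attrs, context), context, attributes)


def is_concept(objects, attrs, context, attributes):
    """Check A' = B and B' = A."""
    return (intent(objects, context, attributes) == attrs
            and extent(attrs, context) == objects)


def implication_holds(premise, conclusion, context):
    """A -> B holds iff A' is a subset of B'."""
    return extent(premise, context) <= extent(conclusion, context)


# Hypothetical toy context with three objects and attributes a, b, c.
ATTRIBUTES = frozenset("abc")
CONTEXT = {
    "g1": frozenset("ab"),
    "g2": frozenset("ac"),
    "g3": frozenset("b"),
}

if __name__ == "__main__":
    print(sorted(closure(frozenset("c"), CONTEXT, ATTRIBUTES)))
    print(implication_holds(frozenset("c"), frozenset("a"), CONTEXT))
```

In this toy context every object with attribute c also has a, so the implication from {c} to {a} holds while the converse does not.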

In addition to the structural attributes of $M$ we consider a goal attribute 
$w \notin M$. This divides the set $G$ of all objects into three subsets: The set $G_+$ 
of those objects that are known to have the property $w$ (these are the positive 
examples), the set $G_-$ of those objects of which it is known that they do not 
have $w$ (the negative examples) and the set $G_\tau$ of undetermined examples, i.e., 
of those objects, of which it is unknown if they have property $w$ or not. This 
gives three subcontexts of $\mathbb{K} = (G, M, I)$: 

$\mathbb{K}_+ := (G_+, M, I_+)$, $\mathbb{K}_- := (G_-, M, I_-)$, and $\mathbb{K}_\tau := (G_\tau, M, I_\tau)$, 

where for $\varepsilon \in \{+, -, \tau\}$ we have $I_\varepsilon := I \cap (G_\varepsilon \times M)$. 

¹ In terms of the JSM-theory, we consider only “counterexample forbidding 
hypotheses” and the “atomistic case,” where there is a single goal attribute. 
Intents are, as defined above, attribute combinations shared by some of the 
observed objects. In order to form hypotheses about structural causes of the goal 
attribute $w$, we are interested in sets of structural attributes that are shared by 
some positive, but by no negative examples. Thus, a positive hypothesis, or 
a (+)-hypothesis, for $w$ is an intent of $\mathbb{K}_+$ that is not contained in the intent $g'$ 
of any negative example $g \in G_-$. A negative hypothesis, or a (-)-hypothesis, 
is defined accordingly. If $g$ is an object such that its intent² 

$g' := \{m \in M \mid (g,m) \in I\}$ 

contains a positive or a negative hypothesis then we say, for short, that $g$ contains 
that hypothesis. 

We intend to use these hypotheses as predictors for the undetermined 
examples, predicting that an object $g \in G_\tau$ has the goal attribute if it contains a 
positive hypothesis and does not have $w$ if it contains a negative one. These cases 
need not be exclusive: it may happen that an object contains both a positive and 
a negative hypothesis. An object that contains a positive, but no negative 
hypothesis will be classified positively. Negative classifications are defined similarly. 
If $g'$ contains hypotheses of both kinds, or if $g'$ contains no hypothesis at all, 
then the classification is contradictory or undetermined, respectively. The 
underlying principle can be formulated in the line of J. S. Mill as “common effects are 
brought about by common causes.” We may restrict ourselves to minimal (w.r.t. inclusion 
$\subseteq$) hypotheses, positive as well as negative, since an object obviously contains a 
positive hypothesis if and only if it contains a minimal positive hypothesis, etc. 
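A brute-force Python sketch of this classification scheme is given here. It enumerates the intents of the positive (or negative) subcontext as intersections over non-empty sets of examples, which is only feasible for very small data. The context below is a hypothetical one: the four determined objects and the two undetermined objects are assumptions chosen for illustration, and all names are invented.

```python
from itertools import combinations


def shared_attributes(objects, context):
    """Attributes common to all objects of a non-empty collection."""
    rows = [context[g] for g in objects]
    return frozenset(set(rows[0]).intersection(*rows[1:]))


def hypotheses(own, other, context):
    """Intents generated by `own` that are contained in no intent of `other`.

    Only intents of non-empty example sets are considered (a simplification).
    """
    candidates = {
        shared_attributes(subset, context)
        for size in range(1, len(own) + 1)
        for subset in combinations(own, size)
    }
    return {h for h in candidates if not any(h <= context[g] for g in other)}


def minimal(sets):
    """Keep only the inclusion-minimal sets."""
    return {h for h in sets if not any(other < h for other in sets)}


def classify(g, positive, negative, context):
    """Classify an object by the hypotheses contained in its intent."""
    has_pos = any(h <= context[g] for h in positive)
    has_neg = any(h <= context[g] for h in negative)
    if has_pos and has_neg:
        return "contradictory"
    if has_pos:
        return "positive"
    if has_neg:
        return "negative"
    return "undetermined"


# Hypothetical data: g1, g2 positive; g3, g4 negative; g5, g6 undetermined.
CONTEXT = {
    "g1": frozenset("ac"), "g2": frozenset("bc"),
    "g3": frozenset("de"), "g4": frozenset("df"),
    "g5": frozenset("acd"), "g6": frozenset("cde"),
}
POSITIVE, NEGATIVE, UNDETERMINED = ["g1", "g2"], ["g3", "g4"], ["g5", "g6"]

if __name__ == "__main__":
    pos = minimal(hypotheses(POSITIVE, NEGATIVE, CONTEXT))
    neg = minimal(hypotheses(NEGATIVE, POSITIVE, CONTEXT))
    for g in UNDETERMINED:
        print(g, classify(g, pos, neg, CONTEXT))
```

Restricting to minimal hypotheses does not change any classification, since an object contains a hypothesis exactly when it contains a minimal one.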

Of course there is, without further assumptions, little hope that these 
predictions will be correct. Realistically, we have no reliable information on the 
unknown set $G_w$ of those objects in $G$ which do have $w$, except for $G_+ \subseteq G_w$ 
and $G_- \cap G_w = \emptyset$. We have not even excluded the possibility that the goal 
does not depend at all on the structural attributes. It might happen that there 
are positive and negative examples with exactly the same intents. All we can 
guarantee is the following, trivial fact: 

Proposition 1. A positive example $g \in G_+$ contains a positive hypothesis if 
and only if $g'$ is not contained in the intent of any negative example. 

Nevertheless it seems natural to operate with hypotheses, at least in 
situations where we expect that the goal attribute somehow depends on the structural 
attributes. An analogue of the following assumption on the (unknown) set $G_w$ 
is called the Spinoza-axiom in [4]: 

The goal attribute $w$ is properly implied in $(G, M, I)$ if there are sets 
$\mathcal{P}, \mathcal{N} \subseteq \mathfrak{P}(M)$³ such that 

1. $g \in G_w \iff P \subseteq g'$ for some $P \in \mathcal{P}$, 

2. $g \notin G_w \iff N \subseteq g'$ for some $N \in \mathcal{N}$. 

² For brevity's sake we write $g'$ and $g''$ instead of $\{g\}'$ and $\{g\}''$, respectively. 

³ $\mathfrak{P}(M)$ is the power set of $M$. 
Thus, the goal attribute is properly implied if there is a set $\mathcal{P}$ of positive 
attribute combinations that force the goal attribute $w$, and a set $\mathcal{N}$ of negative 
attribute combinations that exclude $w$, and if, moreover, these combinations 
would suffice to classify all objects in $G$. 

However, this condition does not require that these positive or negative 
attribute combinations have been observed as such. Let us therefore assume that 
each $P$ and each $N$ is supported by some example, i.e., that 

3. for each $P \in \mathcal{P}$ there is some $g \in G_+$ such that $P \subseteq g'$, 

4. for each $N \in \mathcal{N}$ there is some $g \in G_-$ such that $N \subseteq g'$. 

It seems that this condition is rather strong. It does however not suffice to 
make the hypotheses work, as the following example shows: 





object   crosses in columns a to f   class   w 
g1       X X                         +       yes 
g2       X X                         +       yes 
g3       X X                         -       no 
g4       X X                         -       no 
g5       X X X                       τ       yes, but undetermined 
g6       X X X                       τ       no, but undetermined 



In this formal context, we have $\mathcal{P} := \{\{a,c\},\{b,c\}\}$, $\mathcal{N} := \{\{d,e\},\{d,f\}\}$. Each 
set in $\mathcal{P}$ is a positive hypothesis, each set in $\mathcal{N}$ is a negative hypothesis. But $\{c\}$ 
is also a positive hypothesis and $\{d\}$ is a negative one, which has the effect that 
each undetermined object contains both a positive and a negative hypothesis. 

Note that in this example each $g \in G_w$ contains a positive hypothesis and 
every $g \notin G_w$ contains a negative one. 

Another condition would be that whatever is not causal for the goal attribute 
will be observed to be non-causal. This may be formalized as follows: 

5. If for a set $S \subseteq G_+$ there is no $P \in \mathcal{P}$ with $P \subseteq S'$, then there is some 
$h \in G_-$ such that $S' \subseteq h'$, 

6. if for a set $T \subseteq G_-$ there is no $N \in \mathcal{N}$ with $N \subseteq T'$, then there is some 
$h \in G_+$ such that $T' \subseteq h'$. 
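Conditions 5 and 6 can be tested mechanically on small data. The Python sketch below searches for non-empty example sets that violate such a condition. It uses a hypothetical context whose positive and negative rows are chosen to agree with the families {{a,c},{b,c}} and {{d,e},{d,f}} of the example above; the concrete rows and all names are assumptions.

```python
from itertools import combinations


def shared_attributes(objects, context):
    """Attributes common to all objects of a non-empty collection (S')."""
    rows = [context[g] for g in objects]
    return frozenset(set(rows[0]).intersection(*rows[1:]))


def violations(own, other, family, context):
    """Non-empty subsets S of `own` violating condition 5.

    S violates the condition if no member of `family` lies in S' and no
    object h of `other` satisfies S' <= h'. Swapping the roles of the
    arguments checks condition 6.
    """
    bad = []
    for size in range(1, len(own) + 1):
        for subset in combinations(sorted(own), size):
            common = shared_attributes(subset, context)
            covered = any(p <= common for p in family)
            witnessed = any(common <= context[h] for h in other)
            if not covered and not witnessed:
                bad.append((subset, common))
    return bad


# Hypothetical rows consistent with the families named in the text.
CONTEXT = {
    "g1": frozenset("ac"), "g2": frozenset("bc"),
    "g3": frozenset("de"), "g4": frozenset("df"),
}
P_FAMILY = [frozenset("ac"), frozenset("bc")]
N_FAMILY = [frozenset("de"), frozenset("df")]

if __name__ == "__main__":
    print(violations(["g1", "g2"], ["g3", "g4"], P_FAMILY, CONTEXT))
    print(violations(["g3", "g4"], ["g1", "g2"], N_FAMILY, CONTEXT))
```

On this data both conditions fail: the common part {c} of the positive examples and the common part {d} of the negative examples are never observed on the opposite side, which is exactly why they show up as spurious hypotheses.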

Proposition 2. If conditions 1), 2), 5), and 6) are satisfied, then every positive 
hypothesis contains some $P \in \mathcal{P}$ and every negative hypothesis contains some 
$N \in \mathcal{N}$. 

Then, in particular, any object containing a positive hypothesis must be in 
$G_w$ and those containing a negative hypothesis must be in the complement of $G_w$. 
The converse is not necessarily true: it may happen that the observed hypotheses 
are too large and that, therefore, not every $g \in G_w$ contains a positive hypothesis 
in its intent. We may then try to improve our hypotheses by enlarging $G_+$ and 
$G_-$ by those elements from $G_\tau$ that contain a positive or a negative hypothesis, 
respectively. Conditions 1), 2), 5), and 6) will remain satisfied, but the new 
configuration may lead to new hypotheses which are smaller than the ones before. 
This process may be iterated. For such an iteration process suggested by Finn 
there is no guarantee that the conditions 1) to 6) are fulfilled. The dynamics of 
such an iteration process will not be studied here.⁴ 

2 Hypotheses and Implications 

We have introduced (positive) hypotheses as attribute combinations that imply 
the goal attribute w, but with an additional “conceptual” condition: only intents 
of formal concepts are considered as hypotheses. We explain shortly why this is 
reasonable. 

Suppose that C is some attribute combination occurring only in positive
examples, so that we may expect that C implies w. But suppose that there are
further attributes, say $d_1, \ldots, d_n$, that also “come with C” in the sense that each
observed object having C also has the set of attributes $\{d_1, d_2, \ldots, d_n\}$. Then
there are two typical strategies to generate a hypothesis:

The courageous strategy would argue: “Since we have observed that w occurs
whenever C does, we shall predict w when we observe C.”

The cautious strategy would say: “We have observed that in case of C we also
have $d_1, \ldots, d_n$ and w, so we shall predict w if we meet the same conditions,
i.e., $C \cup \{d_1, \ldots, d_n\}$. If you think that the set of attributes $\{d_1, \ldots, d_n\}$
is superfluous, give an example in which, given C, the set of conditions
$\{d_1, \ldots, d_n\}$ does not hold, but w does.”

The JSM-approach follows the cautious strategy. It is assumed that hypothe- 
ses are closed under implications to other structural attributes. 

However, the subsets closed under implications are precisely the intents of K. 
Therefore, it is sensible to allow only intents (of concepts of K) as hypotheses. 
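The cautious strategy can be illustrated with a small sketch. The context below is hypothetical (the objects p1, p2 and the attributes c, d1, d2, e are not taken from the paper), and the helper names are chosen only for illustration. Closing the attribute combination C = {c} in the positive context adds exactly the attributes that "come with C", which is why only intents are admitted as hypotheses.

```python
from typing import Dict, FrozenSet, Set

# A context is stored as: object name -> its intent (set of attributes).
Context = Dict[str, FrozenSet[str]]


def extent(ctx: Context, attrs: Set[str]) -> Set[str]:
    """Objects that have every attribute in attrs."""
    return {g for g, intent in ctx.items() if attrs <= intent}


def common_attrs(ctx: Context, objs: Set[str], all_attrs: Set[str]) -> Set[str]:
    """Attributes shared by all objects in objs (all attributes if objs is empty)."""
    result = set(all_attrs)
    for g in objs:
        result &= ctx[g]
    return result


def closure(ctx: Context, attrs: Set[str], all_attrs: Set[str]) -> Set[str]:
    """Smallest intent of ctx containing attrs: derive twice."""
    return common_attrs(ctx, extent(ctx, attrs), all_attrs)


# Hypothetical positive context: every object with c also has d1 and d2.
M = {"c", "d1", "d2", "e"}
positive = {
    "p1": frozenset({"c", "d1", "d2"}),
    "p2": frozenset({"c", "d1", "d2", "e"}),
}

# The courageous strategy would keep {"c"}; the cautious one uses its closure.
cautious = closure(positive, {"c"}, M)
```

Here `cautious` is the set {c, d1, d2}: the attribute e is not added because p1 lacks it.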

There is a powerful tool for studying implications in formal contexts: the 
notion of a pseudo intent. The definition is recursive®. A pseudo intent of a 
formal context (G, M, I) is a set $P \subseteq M$ satisfying

1. $P \neq P''$, and

2. for every pseudo intent $Q \subseteq P$, $Q \neq P$, we have $Q'' \subseteq P$.
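The recursive definition can be unfolded by visiting attribute sets in order of increasing size, so that all pseudo intents that are proper subsets of a candidate are already known when it is examined. The following is a brute-force sketch, exponential in |M| and meant only to make the definition concrete; the four-attribute context is hypothetical and the function names are illustrative assumptions.

```python
from itertools import combinations


def closure(ctx, attrs, all_attrs):
    """Return attrs'' : the intersection of all object intents containing attrs."""
    result = set(all_attrs)
    for intent in ctx.values():
        if attrs <= intent:
            result &= intent
    return frozenset(result)


def pseudo_intents(ctx, all_attrs):
    """Enumerate pseudo intents by increasing cardinality (brute force)."""
    found = []
    ordered = sorted(all_attrs)
    for size in range(len(ordered) + 1):
        for combo in combinations(ordered, size):
            p = frozenset(combo)
            if closure(ctx, p, all_attrs) == p:
                continue  # condition 1 fails: p is closed
            # condition 2: every smaller pseudo intent q has q'' inside p
            if all(closure(ctx, q, all_attrs) <= p for q in found if q < p):
                found.append(p)
    return found


# Hypothetical context with three objects over M = {a, b, c, d}.
M = {"a", "b", "c", "d"}
ctx = {
    "g1": frozenset({"a", "b"}),
    "g2": frozenset({"b", "c"}),
    "g3": frozenset({"c", "d"}),
}
result = pseudo_intents(ctx, M)
```

For this context the pseudo intents are {a}, {d}, {a, b, c} and {b, c, d}. The empty set is not among them because it is closed here, so condition 1 fails for it.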

Pseudointents can be used to give an irredundant representation of the impli- 
cational theory of a formal context, see [6] for details. We show that they are 
closely related to the minimal hypotheses: 

Let $\mathbb{K}$, $\mathbb{K}_+$ and $\mathbb{K}_-$ be as above and let

$$\mathbb{K}_\pm := (G_+ \cup G_-,\; M \cup \{w\},\; I_+ \cup I_- \cup G_+ \times \{w\})$$

be the context of the positive and the negative examples, extended by the goal
attribute w in the natural way. The derivation and closure operators of this
context will be denoted by superscripts $\pm$ and $\pm\pm$, respectively. In particular, for
a given set $P \subseteq M$ of attributes we write $P^{\pm\pm}$ to denote the smallest intent of
$\mathbb{K}_\pm$ that contains P. Derivation and closure operators for contexts $\mathbb{K}_+$ and $\mathbb{K}_-$
will be denoted by superscripts $+$, $-$ and $++$, $--$, respectively.

* For details, see [1], [3], [4].

® . . . and a little confusing, because there seems to be no base case. But note that
the empty set automatically fulfills the second condition, since it contains no proper
subsets at all.

Proposition 3. Let H be some minimal positive hypothesis for w (so in particular
$w \notin H$). Then

1. if $P \subseteq H$ is a pseudo intent with $w \in P^{\pm\pm}$, then $P^{\pm\pm} = H \cup \{w\}$,

2. there is some pseudo intent P of $\mathbb{K}_\pm$ such that $P \subseteq H$ and $P^{\pm\pm} = H \cup \{w\}$.

3. If P is a pseudo intent such that $w \notin P$ but $w \in P^{\pm\pm}$, then $P^{\pm\pm} \setminus \{w\}$ is a
positive hypothesis®.

Proof. Suppose that H contains a pseudo intent P with $w \in P^{\pm\pm}$. Then, since
H is an intent of $\mathbb{K}_+$, we get $P^{++} \subseteq H^{++} = H$, thus $P^{++}$ is a hypothesis for
w. The minimality of H implies $H = P^{++}$, which proves 1).

Since $H^{\pm\pm} = H \cup \{w\}$, there must be some pseudo intent $P \subseteq H$ with
$w \in P^{\pm\pm}$. This gives 2).

The closure of P is $P^{\pm\pm}$ and thus $P^{\pm\pm} \setminus \{w\} = P^{++}$ is closed in $\mathbb{K}_+$ and is
contained in no negative example, since otherwise $w \notin P^{\pm\pm}$. Therefore this set
is a hypothesis.

In what follows, a pseudo intent P such that $w \notin P$ and $w \in P^{\pm\pm}$ will be called
(w-)generative.



3 An Example 

To illustrate the notion of a minimal hypothesis and the relation of hypotheses
with pseudointent-based implications we consider an expert analysis of 17 winter
wheel chains (which improve the behavior of a car on winter roads). Information was
given by the table below from the ABAC Magazine (1999, no. 11).

The values of the system attribute give the type of a chain system: SK - rope
chain (Seilkette), SRK - steel ring chain (Stahlringkette), SMS - quick mounting
chain (Schnellmontage-System).

The mount attribute takes the values F and F or R to denote that a chain
of a particular type can be mounted either only on the front wheels or both on
the front and rear wheels. The values of price are given in DM; the values of
con give the average expert assessment of the convenience of a particular type of
chain; the values of snow give average expert assessments of the maneuverability
of a car, with a particular kind of chain, on snow; ice means the same for ice;
the values of dur give average expert assessments of the durability of a particular
kind of chain; the values of grade give average expert assessments of the general
quality of a particular chain type. Smaller values of the attributes con, snow, ice,
dur, and grade correspond to better assessments of the corresponding chain
properties.

® not necessarily a minimal one 
type  system  mount    price  con   snow  ice   dur   grade
1     SK      F        206    1.9   1.4   1.8   2.7   1.8
2     SRK     F or R   520    2.1   0.8   3.8   2.3   1.9
3     SK      F        160    1.7   1.9   1.6   3.7   2.1
4     SK      F        213    1.7   2.0   2.4   3.4   2.1
5     SMS     F or R   598    1.6   2.4   2.7   2.8   2.2
6     SK      F        109    2.0   1.9   2.4   3.7   2.3
7     SRK     F or R   325    2.0   2.1   3.2   2.8   2.3
8     SMS     F or R   498    1.5   3.3   3.5   2.0   2.4
9     SRK     F or R   396    2.8   2.1   3.1   2.5   2.6
10    SRK     F or R   325    2.2   2.2   4.6   3.2   2.6
11    SRK     F or R   389    2.0   2.2   3.3   4.3   2.6
12    SRK     F        298    2.5   2.3   3.3   2.8   2.6
13    SK      F        149    1.9   2.5   4.0   3.8   2.6
14    SMS     F or R   684    1.7   3.3   4.4   2.2   2.6
15    SK      F        99     2.8   2.2   2.5   4.0   2.7
16    SK      F        140    2.6   2.3   3.3   3.4   2.7
17    SK      F        215    2.3   3.8   4.8   2.3   3.1



Here the values of the type attribute stand in for trade names.

As goal attributes we considered the grade (obtained as an average expert
assessment of quality) and the price. Since the information was given in numerical
values, we had to scale it before obtaining hypotheses and pseudointents. The scalings
in both cases were similar, except for the goal attributes.



3.1 Grade 

First, we assumed that values of grade less than or equal to 2.1
testify to the high quality of an item and values of grade greater than or equal
to 2.6 testify to the low quality of an item. Thus, items 1-4 were treated as
positive and items 9-17 as negative examples. The
items with numbers 5-8 were neglected as those with ambiguous medium-value
grades. The positive and negative contexts w.r.t. the goal attribute grade
are given as follows (the values of the grade attribute are given in brackets
to indicate that this is the goal attribute and its actual values are insignificant
within a context, either positive or negative):



Positive context

type  system  mount    price  con   snow  ice   dur   (grade)
1     SK      F        206    1.9   1.4   1.8   2.7   (1.8)
2     SRK     F or R   520    2.1   0.8   3.8   2.3   (1.9)
3     SK      F        160    1.7   1.9   1.6   3.7   (2.1)
4     SK      F        213    1.7   2.0   2.4   3.4   (2.1)

Negative context

type  system  mount    price  con   snow  ice   dur   (grade)
9     SRK     F or R   396    2.8   2.1   3.1   2.5   (2.6)
10    SRK     F or R   325    2.2   2.2   4.6   3.2   (2.6)
11    SRK     F or R   389    2.0   2.2   3.3   4.3   (2.6)
12    SRK     F        298    2.5   2.3   3.3   2.8   (2.6)
13    SK      F        149    1.9   2.5   4.0   3.8   (2.6)
14    SMS     F or R   684    1.7   3.3   4.4   2.2   (2.6)
15    SK      F        99     2.8   2.2   2.5   4.0   (2.7)
16    SK      F        140    2.6   2.3   3.3   3.4   (2.7)
17    SK      F        215    2.3   3.8   4.8   2.3   (3.1)



The present example is not yet fully compatible with our definitions from
Section 1. To obtain a formal context, we apply a conceptual scaling (see [6] for
details), replacing each many-valued attribute by one-valued ones. We only
list the scale attributes. The first two scales are nominal, the others are ordinal.

system   SK       SRK      SMS
mount    F        F or R
price    ≤ 160    ≤ 215    ≤ 520    > 520
con      ≤ 2.1    ≤ 2.5    > 2.5
snow     ≤ 2.0    > 2.0
ice      ≤ 2.4    ≤ 3.0    ≤ 4.0    > 4.0
dur      ≤ 3      ≤ 3.7    > 3.7


The table is read as follows. The original many-valued attributes are listed in the
first column. Each many-valued attribute at the beginning of a row
is replaced by the Boolean attributes in the other positions of that row. For
example, the many-valued attribute system is replaced by the attributes SK, SRK,
and SMS, so that each object gets exactly one of them. The attribute mount is
scaled similarly. This type of scaling is called nominal in [6]. The many-valued
attribute price is replaced by four Boolean attributes ≤ 160, ≤ 215, ≤ 520, and
> 520. In contrast to the nominal attributes system and mount, the objects
that have the attribute ≤ 160 also have the attributes ≤ 215 and ≤ 520 (in what
follows, for brevity's sake, we do not write this explicitly in the descriptions of
object intents); the objects that have the attribute ≤ 215 also have the attribute
≤ 520. The other numerical attributes are scaled similarly. This type of scaling
is called ordinal in [6].
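The two kinds of scaling can be sketched as follows. This is an illustrative sketch, not the authors' implementation: the attribute-name format ("price<=215"), the function names, and the assumption that all thresholds are inclusive are choices made here. The thresholds themselves are the ones listed in the scale table.

```python
def nominal(name, value):
    """Nominal scaling: one Boolean attribute per value, exactly one per object."""
    return {f"{name}={value}"}


def ordinal(name, value, thresholds):
    """Ordinal scaling: the object gets every 'name<=t' with value <= t,
    or the single attribute 'name>t_max' if it exceeds all thresholds."""
    attrs = {f"{name}<={t}" for t in thresholds if value <= t}
    if value > thresholds[-1]:
        attrs.add(f"{name}>{thresholds[-1]}")
    return attrs


# Thresholds from the scale table (assumed inclusive).
ORDINAL_SCALES = {
    "price": [160, 215, 520],
    "con": [2.1, 2.5],
    "snow": [2.0],
    "ice": [2.4, 3.0, 4.0],
    "dur": [3, 3.7],
}


def scale_item(item):
    """Turn one many-valued row into its set of Boolean scale attributes."""
    attrs = set()
    for name, value in item.items():
        if name in ORDINAL_SCALES:
            attrs |= ordinal(name, value, ORDINAL_SCALES[name])
        else:
            attrs |= nominal(name, value)
    return attrs


# Item 1 of the wheel chain table.
item1 = {"system": "SK", "mount": "F", "price": 206,
         "con": 1.9, "snow": 1.4, "ice": 1.8, "dur": 2.7}
intent1 = scale_item(item1)
```

Unlike the abbreviated intents in the text, `intent1` lists the implied ordinal attributes explicitly, e.g. both price<=215 and price<=520.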

The following positive and negative minimal hypotheses were obtained (they
are unique, since the intersections of all positive and of all negative example
intents are nonempty):

- the minimal positive hypothesis: {con ≤ 2.1, snow ≤ 2.0, ice ≤ 4, dur ≤ 3.7};

- the minimal negative hypothesis: {snow > 2.0}.

The generative (w.r.t. the grade attribute) pseudointents of $\mathbb{K}_\pm$ are {snow
≤ 2.0} and {con ≤ 2.1, ice ≤ 4, dur ≤ 3.7}.

The first one, {snow ≤ 2.0}, is obviously a pseudointent of $\mathbb{K}_\pm$, since it is a
nonclosed one-element subset of M. The second one is a pseudointent of $\mathbb{K}_\pm$,
since it is not closed w.r.t. ± and all its proper subsets are closed.

If we introduce the “antigoal attribute” as a new attribute (in this case it 
corresponds to the “low quality of an item”), then the corresponding generative 
pseudointent will be {snow > 2.0}. 

One can see that the positive minimal hypothesis is the closure (in the
positive context with examples 1-4 and all attributes except for the general grade)
of both generative pseudointents (compare with Proposition 3). The
implications

{snow ≤ 2.0} → {good quality},

{con ≤ 2.1, ice ≤ 4, dur ≤ 3.7} → {good quality},

which correspond to the generative pseudointents, coincide here with the minimal
conditions for good quality. They can be used, e.g., by a producer who wants to
attain the best sales at the lowest cost. For example, it suffices for a chain to
make a car behave well on snow for a customer to consider it a good one.
The implication

{con ≤ 2.1, snow ≤ 2.0, ice ≤ 4, dur ≤ 3.7} → {good quality},

which corresponds to the minimal positive hypothesis, informs one about the
whole bunch of attributes relative to the notion “good chain.” This can be
interpreted as the viewpoint of a customer who wants to know what really makes a good
chain. A good chain should be convenient, behave excellently on snow and at
least satisfactorily on ice, and its lifetime should not be very short.

Both viewpoints are justified in their own right, and considering both
gives a multifaceted understanding of the situation under study.

Note that for the above consideration of causes of good quality it is also
reasonable to consider nonminimal hypotheses. For example, the (+)-hypothesis

{SK, F, price ≤ 215, con ≤ 2.1, snow ≤ 2.0, ice ≤ 4.0, dur ≤ 3.7}

describes a class of relatively cheap chains with the same system and the same
mounting possibilities, good behavior on snow and satisfactory behavior on
ice, which have a good assessment of quality. One can also indicate the (-)-hypotheses

{SRK, F or R, price ≤ 520, snow > 2.0};

{SK, F, price ≤ 215, snow > 2.0},

which describe different classes of chains with only low quality. These classes use
different systems, have different mounting possibilities and their prices range
within different intervals.

Nonminimal hypotheses can give information about the taxonomy of positive and
negative examples, which may be useful for understanding the real causes of
the goal attribute or its absence. If a customer identifies the quality of a good
with some specifics of its construction and use, then these hypotheses can
allow him to form simple consumer heuristics like the following one: “the quality
of SRK chains with front or rear mounting is not high, but the quality of SK
chains with front mounting can be different depending on other parameters.”

3.2 Price 

In the case of this goal attribute we took the items with costs less than or equal to
DM 215 to be cheap (positive examples, items 1, 3, 4, 6, 13, 15, 16, 17) and the
items with costs greater than or equal to DM 430 to be expensive (negative examples,
items 2, 5, 8, 14). The items with numbers 7, 9, 10, 11, 12 were neglected as
those with ambiguous medium-value prices. The grade attribute is scaled to
obtain the following three ordinal attributes: grade ≤ 2.1, grade ≤ 2.5, grade
> 2.5. The other attributes were scaled in the same way as in the case of the grade
goal attribute.

The following positive and negative minimal hypotheses were obtained (they 
are unique, since the intersections of all positive and of all negative example 
intents are nonempty): 

- the minimal positive hypothesis: {SK, F};

- the minimal negative hypothesis: {F or R, con ≤ 2.1, dur ≤ 3.0}.

There are two generative (w.r.t. the “cheap” goal attribute) pseudointents:
{SK} and {F}.

If we introduce the “antigoal attribute” as a new attribute (in this case it
corresponds to “expensiveness”), then the corresponding three generative
pseudointents are {F or R}, {con ≤ 2.1}, and {dur ≤ 3.0}.

As in the case of the grade goal attribute, each pseudointent generative
w.r.t. the goal attribute price gives a small sufficient condition for the occurrence
of the goal attribute, whereas minimal hypotheses give a more detailed description of
what a “cheap chain” and an “expensive chain” mean.

The consideration of the hypotheses and pseudointent-based implications
shows that the quality of chains (as assessed by experts) is fairly independent of
their price.

Among nonminimal hypotheses that may be of interest here we can indicate
the negative hypothesis

{SMS, F or R, con ≤ 2.1, snow > 2.0, dur ≤ 3.0},

which describes a class of expensive chains with a certain system and certain
mounting possibilities, convenient, but with bad behavior on snow.



4 Algorithmic Problems 

As shown in [5], the set of all formal concepts of a formal context can be
generated in time $O(|G|^2 \cdot |M| \cdot |\mathfrak{B}(\mathbb{K})|)$, where $|\mathfrak{B}(\mathbb{K})|$ is the size of the concept
lattice of the context $\mathbb{K}$, by an algorithm with polynomial delay. Recall that an
algorithm for listing a family of combinatorial structures is said to have delay d
[9] if and only if it satisfies the following conditions whenever it is run with any
input of length p:

1. It executes at most d(p) computation steps before either outputting the
first structure or halting.

2. After any output it executes at most d(p) machine instructions before
either outputting the next structure or halting.

An algorithm whose delay is bounded from above by a polynomial in the length
of the input is called a polynomial delay algorithm [9].

In [7] the algorithm Next Concept from [5] was extended for the construction
of the concept lattice for a given context (also with polynomial time delay).
Algorithms for generating hypotheses can be found in [13]. However, to generate
the set of all hypotheses, one can also adapt the Next Concept algorithm,
running it in the bottom-up order (from least extents to least intents) in the
following way. To obtain all hypotheses one should repeatedly call the following
procedure, where $\mathcal{H}_+$ denotes the current set of positive hypotheses.

Next Hypothesis

0.  A := a hypothesis from H_+
1.  FOUND := false;
2.  g := LAST_IN(G_+);
3.  while not (FOUND or PREVIOUS_IN(g, G_+) = g) do
4.  begin
5.      if g ∉ A then
6.      begin
7.          A := A ∪ {g};
8.          FOUND := min(A^{++} \ A) > g and HYP(A^+);
9.      end;
10.     A := A \ {g};
11.     g := PREVIOUS_IN(g, G_+);
12. end;
13. next_hyp := FOUND;
14. H_+ := H_+ ∪ {A^+};

The function HYP(A) tests whether the positive intent A is a hypothesis,
i.e., the condition $\forall f \in G_-:\ A \not\subseteq f^-$.

For $X \subseteq G_+$ the function min(X) returns the smallest (w.r.t. the ordering
in $G_+$) element of X.

The function PREVIOUS_IN(g, $G_+$) returns g if g is the first element of $G_+$,
and returns the greatest element of $G_+$ smaller than g, otherwise.

The function LAST_IN($G_+$) returns the greatest element of $G_+$ (i.e., the
element with the greatest number).
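For very small contexts, the output of such a procedure can be cross-checked against a naive enumeration. The sketch below is not Next Hypothesis: it is an exponential brute-force substitute that forms every intersection of positive example intents and keeps those not contained in any negative example intent. It assumes that only intents with a nonempty extent are of interest; the example data and function names are hypothetical.

```python
from itertools import combinations


def positive_hypotheses(pos, neg):
    """All intents of the positive context (with nonempty extent) that are
    not contained in the intent of any negative example."""
    objects = list(pos)
    intents = set()
    for k in range(1, len(objects) + 1):
        for group in combinations(objects, k):
            # The common attributes of a group of positive examples form an intent.
            intents.add(frozenset.intersection(*(pos[g] for g in group)))
    return {h for h in intents if not any(h <= n for n in neg.values())}


def minimal(family):
    """Members of the family that have no proper subset in the family."""
    return {h for h in family if not any(other < h for other in family)}


# Hypothetical examples: intents are frozensets of attribute names.
pos = {"p1": frozenset("ac"), "p2": frozenset("bc")}
neg_weak = {"n1": frozenset("d")}
neg_strong = {"n1": frozenset("cd")}

hyps_weak = positive_hypotheses(pos, neg_weak)
hyps_strong = positive_hypotheses(pos, neg_strong)
```

With the negative example {d}, the common part {c} is itself a hypothesis and is the only minimal one. With the negative example {c, d}, the set {c} is falsified and both {a, c} and {b, c} are minimal hypotheses.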

When the algorithm that computes the set of all hypotheses by calling Next
Hypothesis generates a new positive intent, it needs to test whether this intent
was not generated before and whether it is not contained in a negative example
(line 8). Execution of line 8 takes $O(|G_+| \cdot |M| + |G_-| \cdot |M|)$ time. If the positive
intent is not contained in any negative example, then the process of concept
generation goes further; otherwise, the intent is not a hypothesis and the algorithm
backtracks. Thus, the test is executed for no more than $O(|G_+| \cdot |\mathcal{H}_+|)$ positive
concepts ($\mathcal{H}_+$ is the set of all positive hypotheses) and the resulting time
complexity is $O(|G_+| \cdot |G_-| \cdot |M| \cdot |\mathcal{H}_+|)$. The algorithm for finding all hypotheses
has delay $O((|G_+|^2 + |G_-|) \cdot |M|)$, i.e., is a polynomial delay algorithm.

To find all minimal hypotheses, one needs to replace line 8 with the
following line 8* (and make some minor changes in the procedure that calls
Next Hypothesis):

8*. FOUND := min(A^{++} \ A) > g and ∀f ∈ G_- (A^+ ⊈ f^- and MINHYP(A^+));

The function MINHYP(X) tests whether the hypothesis X is a minimal
hypothesis: first, all sets $X \cap g^+$, $g \in G_+ \setminus X^+$, that are maximal by inclusion are generated,
then the condition $(X \cap g^+) \not\subseteq g_-^-$ is tested for each generated set and each
$g_- \in G_-$.

The MINHYP test requires additional $O(|G_+|^2 |M| \cdot |G_-|)$ operations for at
most $|\mathcal{H}_+|$ hypothesis intents, so the resulting batch algorithm that constructs
the set of all minimal hypotheses has $O(|G_+|^2 |M| \cdot |G_-| \cdot |\mathcal{H}_+|)$ time complexity.
The algorithm that computes all minimal hypotheses by calling Next Hypothesis
with line 8* is not polynomial-delay. It is not clear whether it satisfies a
weaker notion of efficiency, namely whether it has cumulative polynomial delay. Recall that
an algorithm listing a set of objects is said to have a cumulative delay d [8] if
at any point of time in any execution of the algorithm with any
input of length p the total number of instructions that have been executed is at
most d(p) plus the product of d(p) and the number of structures that have been
output so far. Cumulative polynomial delay means that d(p) is a polynomial
in p.

Below, we give an incremental algorithm Next Example that modifies the
list $\mathcal{H}^m_+$ of all minimal positive hypotheses if the example context $\mathbb{K}_\pm$ is updated
with a new object $g_n$.

• If $g_n$ has the goal attribute w, then all implications $H \to w$ of the old
context remain true in the new context, but it may happen that some $H \in \mathcal{H}^m_+$
are no longer minimal hypotheses. These must be replaced by $H \cap g_n^+$, which is
clearly a minimal hypothesis.

• If $g_n$ does not have w, then all implications $H \to w$ of the old context with
$H \subseteq g_n^-$ become false. If $H \in \mathcal{H}^m_+$ and $H \subseteq g_n^-$, then H is no longer a positive
hypothesis (it is called a falsified hypothesis), and it is deleted from $\mathcal{H}^m_+$. There
may be new minimal hypotheses, which must be of the form $(H \cup \{m\})^{++}$, where
$m \notin g_n^-$. These new minimal hypotheses are “most general specializations” of
the old ones relative to the new negative example. We systematically generate
the sets $(H \cup \{m\})^{++}$ and check if they are minimal hypotheses.
Therefore, our procedure is this:

1. For every $H \in \mathcal{H}^m_+$ with $H \subseteq g_n^-$ we delete H from $\mathcal{H}^m_+$.

2. For every $H \in \mathcal{H}^m_+$ with $H \subseteq g_n^-$ and every $m \in M \setminus g_n^-$ we construct
$F := (H \cup \{m\})^{++}$. The set $\mathcal{T}$ (of tentatively new minimal hypotheses) is updated
with F.

3. Each element of $\mathcal{T}$ that is a superset of another element of $\mathcal{T}$ is deleted from $\mathcal{T}$.

4. The family $\mathcal{H}^m_+$ is updated with sets from $\mathcal{T}$. Finally, each element of $\mathcal{H}^m_+$ that
is a superset of another element of $\mathcal{H}^m_+$ is deleted from $\mathcal{H}^m_+$.
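The four steps for a new negative example can be sketched as follows. This is a minimal illustration under stated assumptions, not the Next Example algorithm itself: minimal hypotheses and intents are represented as frozensets, the closure is taken in the positive context, and the names `closure_pos`, `keep_minimal` and `add_negative_example` as well as the example data are hypothetical.

```python
def closure_pos(pos_intents, attrs, all_attrs):
    """Closure of attrs in the positive context: intersect all positive
    example intents that contain attrs (all attributes if there are none)."""
    result = set(all_attrs)
    for intent in pos_intents:
        if attrs <= intent:
            result &= intent
    return frozenset(result)


def keep_minimal(family):
    """Drop every set that is a proper superset of another set in the family."""
    return {s for s in family if not any(t < s for t in family)}


def add_negative_example(min_hyps, pos_intents, neg_intent, all_attrs):
    """Update the minimal positive hypotheses after a new negative example."""
    # Step 1: hypotheses contained in the new negative intent are falsified.
    falsified = {h for h in min_hyps if h <= neg_intent}
    survivors = set(min_hyps) - falsified
    # Step 2: specialize each falsified hypothesis by one attribute that the
    # negative example lacks, and close the result in the positive context.
    candidates = set()
    for h in falsified:
        for m in set(all_attrs) - set(neg_intent):
            candidates.add(closure_pos(pos_intents, h | {m}, all_attrs))
    # Step 3: keep only inclusion-minimal candidates.
    candidates = keep_minimal(candidates)
    # Step 4: merge with the surviving hypotheses and minimize again.
    return keep_minimal(survivors | candidates)


# Hypothetical data: two positive examples and one old minimal hypothesis {c}.
M = {"a", "b", "c"}
pos_intents = [frozenset("ac"), frozenset("bc")]
old_min_hyps = {frozenset("c")}

# A new negative example with intent {c} falsifies the hypothesis {c}.
updated = add_negative_example(old_min_hyps, pos_intents, frozenset("c"), M)
```

In this example the falsified hypothesis {c} is replaced by its two most general specializations, {a, c} and {b, c}.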

Below we present a more formal pseudocode description of the algorithm.
Here for two families of sets $\mathcal{X}$ and $\mathcal{Y}$ the function MERGE($\mathcal{X}$, $\mathcal{Y}$) takes the
union of $\mathcal{X}$ and $\mathcal{Y}$, discards every element set of it that is a superset of another
set from the union (only one set is left for a pair of equal sets), and returns
the resulting family of sets. Obviously, MERGE($\mathcal{X}$, $\mathcal{X}$) selects all sets from $\mathcal{X}$
that are minimal with respect to the set-theoretic inclusion $\subseteq$.

The function NEXT_IN($\mathcal{H}^m_+$, X, Cond) returns “true” if there is an element
of $\mathcal{H}^m_+$ that is lectically greater than X and satisfies condition Cond (in this case
X takes the value of this element) and returns “false” otherwise. The function
FIRST_IN($\mathcal{H}^m_+$, X, Cond) returns “true” if there is an element of $\mathcal{H}^m_+$ that
satisfies condition Cond (in this case X takes the value of the lectically smallest
such element) and returns “false” otherwise.

Next Example

0   T := ∅;
1   if w ∈ g_n' then
2       if FIRST_IN(H^m_+, H, H ⊈ g_n^+) then
3       repeat
4           if for all h ∈ G_-: H ∩ g_n^+ ⊈ h^- then
5               H^m_+ := H^m_+ \ {H} ∪ {H ∩ g_n^+}
6       until not NEXT_IN(H^m_+, H, H ⊈ g_n^+)
7   else
8       if FIRST_IN(H^m_+, H, H ⊆ g_n^-) then
9       repeat
10          H^m_+ := H^m_+ \ {H}
11          for all m ∉ g_n^- do T := T ∪ {(H ∪ {m})^{++}}
12      until not NEXT_IN(H^m_+, H, H ⊆ g_n^-)
13      if T ≠ ∅ then
14      begin
15          T := MERGE(T, T)
16          H^m_+ := MERGE(H^m_+, T)
17      end

Complexity. If $g_n$ is a new positive example (i.e., $w \in g_n'$), then the algorithm
terminates in time $O(|\mathcal{H}^m_+| \cdot |G_-| \cdot |M|)$, since it intersects each old minimal
hypothesis with $g_n^+$ and tests the inclusion of the result in the intent of each
negative example (lines 1-6).

If $g_n$ is a new negative example (i.e., $w \notin g_n'$), then in the worst case it will
take much more time to terminate. Lines 9-12 are executed in the worst case
for each element of $\mathcal{H}^m_+$ and for each $m \notin g_n^-$ and, therefore, require
$O(|\mathcal{H}^m_+| \cdot |M| \cdot |G_+| \cdot |M|)$ time, since line 11, the most time-consuming operation of the
cycle 9-12, is done in $O(|G_+| \cdot |M|)$ time.

Since $\mathcal{T}$ has at most $|\mathcal{H}^m_+| \cdot |M|$ elements, lines 15 and 16 can be executed
in $O(|\mathcal{H}^m_+|^2 |M|^2)$ time each. Note that line 15 is not necessary: it can reduce the
execution time in practice, but does not affect the upper bound of the worst-case
complexity. Thus, the complexity of executing lines 14-17 is also $O(|\mathcal{H}^m_+|^2 |M|^2)$
and the total time complexity of the algorithm is $O(|\mathcal{H}^m_+|^2 |M|^2 + |\mathcal{H}^m_+| \cdot |M|^2 |G|)$.

Note that the worst-case time complexity of the algorithm is quadratic in the
size of the set of all minimal hypotheses and not of the set of all hypotheses.
Therefore, when the number of all minimal hypotheses is much less than the
number of all hypotheses, the incremental algorithm can operate faster than the
batch algorithm, whose time complexity is linear in the number of all hypotheses.

5 Conclusion 

We considered the JSM-method of generating hypotheses from positive and neg- 
ative examples in terms of Formal Concept Analysis. We presented conditions 
that ensure correct behavior of hypotheses relative to examples. We showed how 
hypotheses and minimal hypotheses are related to pseudointent-based implica- 
tions. We proposed a batch algorithm for computing all hypotheses and/or all 
minimal hypotheses in time linear in the number of all hypotheses. We also pro- 
posed an incremental algorithm for computing all minimal hypotheses, which 
runs in time quadratic in the number of minimal hypotheses. The example con- 
cerning consumer properties of wheel chains illustrated the relationship between 
hypotheses and implications. 

Abstract. Formal Concept Analysis is an algebraic model based on a
Galois connection that is used for symbolic knowledge exploration from 
an elementary form of the representation of data (“formal context”). 
The aim of this paper is to design the theoretical models required for the 
extension of Formal Concept Analysis to any kind of lattice-structured 
set of properties (“generalized formal concept”), and especially the case 
of predicates. Beyond their theoretical interest, the models aim at better 
solving real applied problems thanks to an improvement of the knowledge 
representation skills. 



1 Introduction 

1.1 Formal Analyses 

As far as knowledge information is concerned, the induction of clustered groups
of entities from a context described through their features is a pivotal topic.
Generally, if numerical valuations (belief measures, preferences...) can be defined on
the considered data, numerical or mixed methods can be directly used: classical
classification tools, fuzzy methods and, closer to the symbolic community,
rough sets [20] or the cartesian space model [13]... In some cases, requirements or
accessibility constraints force one to rely only on symbolic attributes. Thus, more
fundamental models and techniques are required; Formal Concept Analysis (also
called “Galois Lattices”) [10] is a suitable candidate for such a purpose. Even
if FCA theory relies on an elementary form of the representation of data (the
“cross table”), the use of more complex data types, such as “many-valued
contexts”, is also possible. The data then have to be reduced to the basic type by a
method of interpretation. A large number of applications require characterizing
the context with more detailed attributes (e.g. properties with arguments:
Speed(vehicle1, 12), Aircraft(B0727, landing)). Defining an adequate method
of interpretation which clarifies all the links between such attributes is not
obvious. The purpose of this paper is to give a new framework which allows the
concept lattice to be calculated without reducing the data to the basic type when
objects are described by properties within a lattice-structured set, such as sets
of literals from a first-order logical language.

The relations between FCA and first-order logic have been studied, especially
in [22]. In this work the model-logic approach is adopted and the entities are
considered as constant symbols that appear as arguments of predicates representing
the properties. This consistent formal construction does not aim at directly
representing knowledge of a priori given entities with their attributes. Our approach
may be considered as closer to the original purpose of FCA as it is more focused
on the formal prerequisites of a consistent model of predicate-logic-like FCA
and its programming conditions. This leads us to search for first-order operators
corresponding to the fundamental ones in FCA: set union, set intersection, and
the Galois connection (in fact, nothing else but these three operators is needed
to generate the FCA theory). In a first step, an extension of FCA to any kind
of lattice-structured set of properties is built: the Generalized Formal Concept
Analysis (say: G-FCA). Then a lattice structure is given to sets of literals by the
Cube model and a predicate-based model is designed and implemented in Prolog:
the Cubical Formal Concept Analysis (C-FCA). C-FCA is a formal extension of
FCA (every context of FCA is a context of G-FCA without any transformation),
while G-FCA is a general frame for both. Nevertheless, it is clear that
every concept lattice generated by G-FCA (or C-FCA) is isomorphic to some
concept lattice of FCA. Reducing the data of the G-FCA (or C-FCA) to the
basic type in order to build this isomorphic concept lattice with FCA theory is
always possible, but difficult.

Foreword: In the context of the paper the proofs are omitted, all of them are com- 
pletely detailed in [8] and [16]. 



1.2 Application 

Throughout this paper, a very simple example of aeronautical incident analysis
is used. The information about aeronautical problems comes from many reports
produced by the crews of the airplanes involved. Given the tremendous number
of these heterogeneous incident reports, the main objective of the aeronautical
community is to find means of discovering precursors of accidents. This requires
tools to represent knowledge and to discover links between features of
incidents and accidents. In our example (which was extracted from the large
NASA incidents database [17]), the data are represented in a synthetic way
in Table 1.



Table 1. Description of aeronautical accidents

Accident  Persons       Plane                     Scenario
                        aircraft  Flight Phase
Acc1      crew(B0727)   B0727     Take Off        Airmissground
          ATC           DC9       Take Off
Acc2      ATC           B0727     Landing         Airmissground
          crew(MD82)    MD82      Landing
Acc3      ATC           B0727     Landing         Airmissground
                        DC9       Take Off
Acc4                    MD82      Landing         Windshear

For example, the first line captures (i) that in the accident denoted “Acc1”
the crew of the Boeing 727 and the air traffic control (ATC) both bear part
of the responsibility, (ii) that both aircraft (Boeing 727 and DC9) were taking off and
(iii) that the global scenario of the incident was a so-called “airmissground”.

The aim of data analysis is to cluster the accidents and to point out the
specificity of each set. The better the connections between accidents are described,
the better the typical dangerous configurations are highlighted.

2 A Generalization of Formal Concept Analysis 

Formal Concept Analysis (say: FCA) is a set-theoretical model for concepts that
captures the philosophical understanding of a concept as a context-based unit of
thought consisting of two parts: the extent, which contains all the entities (the
objects, the examples...) belonging to the concept, and the intent, which is the
collection of all the attributes (the characteristics, the properties...) shared by
the entities [1]. Based upon Galois connections, FCA was described in [2], and
in the 1980s Rudolf Wille designed a dedicated theory and launched a research
program at the University of Darmstadt. An introduction can be found in [9]
and FCA theory is now described in the reference book of Ganter and Wille [10].

In FCA, the basic notion that models the knowledge about a specific domain
is the formal context (described as a binary relation between two sets), from
which concepts and conceptual double hierarchies can be formally derived so as
to form the mathematical structure of a lattice^1 with respect to a
subconcept-superconcept relation. FCA is used for self-emergent classification
of objects, detection of hidden implications between objects, construction of
concept sequences, object recognition, aggregation of data and information,
knowledge representation and analysis. FCA is also frequently used as a
preprocessing tool for classification [6].

Classical definitions of FCA are not recalled, but the method is exemplified 
with a naive interpretation of table 1. 



2.1 Example of Fundamental Formal Concept Analysis 

Let the data of table 1 be reduced to table 2, which defines a relation R
between the set of objects O = {Acc1, Acc2, Acc3, Acc4} and the set of attributes
P = {crew-B0727, crew-MD82, ATC, B0727-take-off, B0727-landing,
DC9-take-off, MD82-landing, Airmissground, Windshear}. $C_1$ = (O, P, R) is a
formal context.

The Galois connection, denoted ', joins a set of objects with the set of
attributes common to these objects and, conversely, it joins a set of attributes
with the set of objects which have all those attributes. We have:
{Acc1, Acc2}' = {ATC, Airmissground} and {ATC, Airmissground}' = {Acc1, Acc2, Acc3}.

^1 Given two internal operators $\sqcap$ (infimum) and $\sqcup$ (supremum) on a set E,
$(E, \sqcap, \sqcup)$ is a lattice iff_def $\sqcap$ and $\sqcup$ are idempotent, associative,
commutative and they verify the absorption laws $x \sqcup (x \sqcap y) = x$ and
$x \sqcap (x \sqcup y) = x$. A lattice is always an ordered set: the relation $\leq$
defined on E as $(x \leq y) \Leftrightarrow_{def} (x \sqcap y) = x$ is an order relation
for which $\sqcap$ and $\sqcup$ represent the greatest lower bound and the least upper
bound [3].

Table 2. A naive interpretation of table 1

R    | crew-B0727 | crew-MD82 | ATC | B0727-take-off | B0727-landing | DC9-take-off | MD82-landing | Airmissground | Windshear
Acc1 | X          |           | X   | X              |               | X            |              | X             |
Acc2 |            | X         | X   |                | X             |              | X            | X             |
Acc3 |            |           | X   |                | X             | X            |              | X             |
Acc4 |            |           |     |                |               |              | X            |               | X

A concept of this context is a pair (A, B) with A a set of objects, B a set of
attributes, A' = B and B' = A.

As {Acc1, Acc2, Acc3}' = {ATC, Airmissground}, the pair ({Acc1, Acc2,
Acc3}, {ATC, Airmissground}) is a concept of $C_1$.

In fact, 10 concepts are induced by $C_1$:

1: ({Acc1, Acc2, Acc3, Acc4}, {})

2: ({Acc1, Acc2, Acc3}, {ATC, Airmissground})

3: ({Acc2, Acc4}, {MD82-landing})

4: ({Acc2, Acc3}, {ATC, Airmissground, B0727-landing})

5: ({Acc1, Acc3}, {ATC, Airmissground, DC9-take-off})

6: ({Acc4}, {MD82-landing, Windshear})

7: ({Acc2}, {ATC, Airmissground, B0727-landing, Crew-MD82, MD82-landing})

8: ({Acc3}, {ATC, Airmissground, DC9-take-off, B0727-landing})

9: ({Acc1}, {ATC, Airmissground, DC9-take-off, B0727-take-off, Crew-B0727})

10: ({}, P)
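
The derivation operators and the enumeration of concepts can be illustrated by a short Python sketch. It is only a brute-force illustration over all subsets of objects (adequate for four accidents), not the authors' Prolog implementation; the names `CONTEXT`, `intent`, `extent` and `concepts` are introduced here for the example, and the data are transcribed from table 2.

```python
from itertools import combinations

# Context C_1, transcribed from table 2: object -> set of attributes.
CONTEXT = {
    "Acc1": {"crew-B0727", "ATC", "B0727-take-off", "DC9-take-off", "Airmissground"},
    "Acc2": {"crew-MD82", "ATC", "B0727-landing", "MD82-landing", "Airmissground"},
    "Acc3": {"ATC", "B0727-landing", "DC9-take-off", "Airmissground"},
    "Acc4": {"MD82-landing", "Windshear"},
}
ATTRIBUTES = frozenset().union(*CONTEXT.values())


def intent(objects):
    """Attributes shared by every object of the set (all of P for the empty set)."""
    shared = set(ATTRIBUTES)
    for obj in objects:
        shared &= CONTEXT[obj]
    return frozenset(shared)


def extent(attributes):
    """Objects that have all the given attributes."""
    return frozenset(o for o, attrs in CONTEXT.items() if attributes <= attrs)


def concepts():
    """Brute force: close every subset of objects and collect the distinct pairs."""
    found = set()
    objs = sorted(CONTEXT)
    for size in range(len(objs) + 1):
        for subset in combinations(objs, size):
            b = intent(subset)
            found.add((extent(b), b))
    return found
```

Calling `intent({"Acc1", "Acc2"})` gives the two shared attributes ATC and Airmissground, and `concepts()` returns the ten pairs listed above.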

These concepts have a lattice structure, as illustrated by figure 1 (the Hasse
diagram of the lattice). Traditionally, on such a diagram, an element labeled by
an object Obj represents the concept with the smallest extent containing Obj, and
an element labeled by a property Prop represents the concept with the smallest
intent containing Prop. Hence, a given concept C inherits all the properties which
are linked above it in the diagram, and C is constituted of all the objects which
are linked below it.

FCA allows us to find the common attributes of sets of accidents. Nevertheless,
we can see on the diagram that accidents 3 and 4 are not related even
if, in the data of section 2, both accidents involved a landing plane. Thus, the
characterization of accidents by table 2 does not describe all the connections
precisely enough. The description of table 1 is closer to first-order
logic. Unfortunately, such a richer representation cannot be submitted to the
simple set-theory laws, as
{Aircraft(B0727, take-off)} $\cap$ {Aircraft(DC9, take-off)} = $\emptyset$
although they clearly have common pieces of knowledge.
Fig. 1. The Concept Lattice $L_1$



On one hand, we can try to make clear all the connections contained in the
representation of table 1. Thus, the attribute Aircraft(B0727, take-off) could
be represented by four elementary properties: aircraft, B0727, take-off,
B0727-take-off. Nevertheless, considering the description of Acc1 in table 1, a
lot of elementary properties will have to be derived, such as:

- the two aircraft are taking off;
- the two aircraft are in the same flight phase;
- the crew of an airplane which is taking off is involved in the accident;
- the crew of an airplane which is in the same flight phase as a DC9
  is involved in the accident...

Therefore it will be difficult to describe all the actual elementary properties
which are directly captured by the first-order representation.

On the other hand, we can keep the first-order representation of table 1 and
search for first-order operators corresponding to the fundamental ones in FCA:
set union, set intersection and the Galois connection. Thus, two models must be
designed: (1) a generic extension of FCA to all kinds of lattices (this is the
purpose of the next section), and (2) a lattice structure on conjunctions of
predicate literals that copies the set-theory operators of the propositional
calculus (this is the Cube model described in section 3.1).



2.2 Generalized Formal Concept Analysis 

In this section, we build an extension of the fundamental definitions of formal
concept analysis theory to the more general case in which the properties of the
objects in O belong to any kind of lattice. Therefore, we define the new context
and all the related tools.

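Definition 1

A generalized context $\mathbb{K}$ is a triple $(O, (\mathcal{L}, \leq, \sqcap, \sqcup), \zeta)$ where $O$ is a
finite set, $(\mathcal{L}, \leq, \sqcap, \sqcup)$ is a lattice and $\zeta$ is a mapping from $O$ to $\mathcal{L}$.

' and ° are two mappings defined by:

$' : \mathcal{P}(O) \to \mathcal{L}, \quad X \mapsto X' = \sqcap_{x \in X} \zeta(x)$

$^{\circ} : \mathcal{L} \to \mathcal{P}(O), \quad Y \mapsto Y^{\circ} = \{x \in O \mid Y \leq \zeta(x)\}$

We proved in [16] that this pair of maps is a Galois connection; they satisfy
the classical properties:

$X \subseteq X_1 \Rightarrow X' \geq X_1'$ and $X \subseteq X'^{\circ}$

$Y \leq Y_1 \Rightarrow Y_1^{\circ} \subseteq Y^{\circ}$ and $Y \leq Y^{\circ\prime}$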

Furthermore it is easy to prove the following proposition: 



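Proposition 1

Let $\mathbb{K} = (O, (\mathcal{L}, \leq, \sqcap, \sqcup), \zeta)$ be a generalized context, $X$ and $X_j$ ($j \in J$)
subsets of $O$, and $Y_j$ ($j \in J$) elements of $\mathcal{L}$. Then:

$X' = X'^{\circ\prime}$

$(\bigcup_{j \in J} X_j)' = \sqcap_{j \in J} X_j'$

$(\sqcup_{j \in J} Y_j)^{\circ} = \bigcap_{j \in J} Y_j^{\circ}$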



We are now equipped enough to define a concept of a generalized context as 
a couple of objects and properties stable for the ' and ° operators. 



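Definitions 2

A generalized concept from the generalized context $\mathbb{K} = (O, (\mathcal{L}, \leq, \sqcap, \sqcup), \zeta)$ is a
couple $(X, Y)$ such that $X \subseteq O$, $Y \in \mathcal{L}$, $X' = Y$, $Y^{\circ} = X$.

The set of all the concepts of the context $\mathbb{K}$ is denoted by $T_{\mathcal{L}}(\mathbb{K})$.

The order relation is defined as a mapping $\ll$:

$\ll \; : T_{\mathcal{L}}(\mathbb{K}) \times T_{\mathcal{L}}(\mathbb{K}) \to \{True, False\}$

$((X_1, Y_1), (X_2, Y_2)) \mapsto True$ iff $X_1 \subseteq X_2$

Proposition 2 $\ll$ is an order relation on $T_{\mathcal{L}}(\mathbb{K})$.

Definition 3 The infimum $\wedge$ and the supremum $\vee$ are respectively defined by:

$(X_1, Y_1) \wedge (X_2, Y_2) = (X_1 \cap X_2, (Y_1 \sqcup Y_2)^{\circ\prime})$

$(X_1, Y_1) \vee (X_2, Y_2) = ((X_1 \cup X_2)'^{\circ}, Y_1 \sqcap Y_2)$

And the fundamental theorem is:

Theorem 3 $(T_{\mathcal{L}}(\mathbb{K}), \ll, \wedge, \vee)$ is a complete lattice.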
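
To make the generalized definitions concrete, here is a minimal Python sketch in which the lattice of properties is not a power set. As an assumption made purely for illustration, the lattice is the set of natural numbers ordered by divisibility (infimum = gcd, top element = 0); the context `ZETA` and the function names are hypothetical and do not come from the paper.

```python
from functools import reduce
from itertools import combinations
from math import gcd

# Hypothetical generalized context: zeta maps each object to ONE lattice element.
# Lattice: natural numbers ordered by divisibility (meet = gcd, top = 0).
ZETA = {"a": 12, "b": 18, "c": 8, "d": 30}


def leq(y, z):
    """y <= z in the divisibility order, i.e. y divides z (0 is the top element)."""
    return z == 0 if y == 0 else z % y == 0


def prime(objects):
    """X' = infimum of zeta(x) over x in X; the empty infimum is the top element 0."""
    return reduce(gcd, (ZETA[x] for x in objects), 0)


def circ(y):
    """Y° = set of objects x such that Y <= zeta(x)."""
    return frozenset(x for x in ZETA if leq(y, ZETA[x]))


def generalized_concepts():
    """All couples (X, Y) with X' = Y and Y° = X, found by closing every subset."""
    found = set()
    objs = sorted(ZETA)
    for size in range(len(objs) + 1):
        for subset in combinations(objs, size):
            y = prime(subset)
            found.add((circ(y), y))
    return found
```

For instance `prime({"a", "b"})` is 6 and `circ(6)` is {a, b, d}, so ({a, b, d}, 6) is a generalized concept of this toy context: the property 6 plays the role that a shared attribute set plays in fundamental FCA.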



Examples 

• FCA is a particular case of G-FCA. Indeed, by replacing the triple of operators
($\ll$, $\wedge$, $\vee$) by ($\subseteq$, $\cap$, $\cup$), the definitions and results are those of the
fundamental FCA.

• Let $L$ be a classical first-order logic language, and $\mathcal{F}_L$ the set of all well
formed formulas. The “common sense implication”^2 defines a preorder on $\mathcal{F}_L$
and, considering its correlated equivalence relation $\equiv$, the quotient set
$\mathcal{F}_L/{\equiv}$ is an ordered set. Moreover $(\mathcal{F}_L/{\equiv}, \wedge, \vee)$ is a lattice. Stating
$\mathcal{L} = \mathcal{F}_L/{\equiv}$, thanks to the generalized concept analysis, we define a complete
first-order formal concept analysis in which each object of a finite set O is
characterized by a finite set of well formed formulas (i.e. a knowledge base).
This structure may be of theoretical interest for Knowledge Base Clustering and
more generally for Symbolic Fusion.

^2 Implication $\rightarrow$ by itself is not a relation, it is a functor. Common sense
implication, say $\Rightarrow$, between formulas must be strictly defined as:
$(\varphi \Rightarrow \psi)$ iff_def $\vdash (\varphi \rightarrow \psi)$.

But in the case of real applications based on perception of information, the
representations of entities rely on simpler formulas, and first-order formal
concept analysis must be specialized so as to consider balanced complex terms
to capture the attributes. This is the aim of the Cube model, which deals with
simplified formulas expressed as conjunctions of literals.



3 Cubical Formal Analysis 

3.1 The Cube Model 

The cube model is dedicated to formalizing conjunctions of properties and, in
this study, it will allow first-order contexts to be defined. The cube model is
based on a classical first-order language (Const, Var, Funct, Pred) whose set of
terms Term is the functional closure of Const $\cup$ Var by Funct. Such an approach
has been used for knowledge representation in intelligence systems and
cooperative systems [7]. The elementary properties are represented by literals
and the elements of their power set $\mathcal{C}$ are called logical cubes^3; they are
interpreted as the conjunction of their literals. Cubes play a role dual to that
of the classical clauses, and by default their variables are existentially
quantified.

In expert or deductive systems, knowledge squares with general rules that are
captured by clauses: c = {$\neg$Aircraft(x, Landing), Pers(Crew(x))} represents
the information “landing aircraft have crew”; the associated logical formula is
$\forall x\,(\neg Aircraft(x, Landing) \vee Pers(Crew(x)))$. When we plan to describe as a
context the state of an observed situation, cubes are more adequate and c =
{$\neg$Aircraft(x, Landing), Pers(Crew(x))} means: “the crew of an airplane which
is not in the landing phase is involved in the accident”. The logical
interpretation is $\exists x\,(\neg Aircraft(x, Landing) \wedge Pers(Crew(x)))$^4.

^2 (continued) Thus, this implication is a relation between individual well
formed formulas, which is a preorder.

^3 “Cube” is the name that was used for the first time by A. Thayse [21].

^4 Remark: the attributes $\neg$Aircraft(x, Landing) and $\neg$Aircraft(y, Landing) have
the same meaning but the cubes {$\neg$Aircraft(x, Landing), Pers(Crew(x))} and
{$\neg$Aircraft(y, Landing), Pers(Crew(x))} do not. Hence, describing an object by the
attribute $\neg$Aircraft(x, Landing) (which is equivalent to $\neg$Aircraft(y, Landing)) and
by the attribute Pers(Crew(x)) does not have the same meaning as describing the
object by the cube {$\neg$Aircraft(x, Landing), Pers(Crew(x))}. That is why a mapping
from O to $\mathcal{L}$ is used in the definition of a generalized context rather than a
relation (a relation cannot capture links between variables).

As far as knowledge representation of a context is concerned, we would
like the order relation induced on $\mathcal{C}$ to capture the intuitive notion of
“information enrichment”. But such an “enrichment” can be obtained via different
means: quantity of information, precision of terms, logical dependency. For
instance, {Aircraft(B0727, Landing), Pers(Crew(x))} is more informative than
{Pers(Crew(x))} because the number of literals is higher; and {Pers(Crew(DC9))}
is more informative than {Pers(Crew(x))} (with x a variable) for the sake of
precision. Unfortunately, the combination of both intuitive criteria may lead to
a more complex definition of the order relation, as {Aircraft(B0727, Landing),
Pers(Crew(x))} cannot be directly compared to {Pers(Crew(DC9))}. The previous
cases highlight the need for sound definitions of the intuitive notions of union
and intersection of two finite information sets, in accordance with the following
requirements: the infimum has to capture the common features (while giving more
information than the empty set frequently generated by the unification rule);
the supremum has to cope with the complementary criteria of quantity and
precision of the information (while giving a more synthetic result than the set
union). Such purposes are achieved thanks to an algebraic approach.

Definition 4 $(\forall c_i \in \mathcal{C})$ $c_1$ subsumes $c_2$, and we write $c_1 \leq_c c_2$, when
$c_1\sigma \subseteq c_2$ for some substitution $\sigma$.

$c_1$ is said to be equivalent to $c_2$, written $c_1 =_c c_2$, when $c_1 \leq_c c_2$ and
$c_2 \leq_c c_1$.

As $\leq_c$ is a preorder, $=_c$ is an equivalence relation.

Remark: symbol $\subset$ is equivalent to $\subseteq$ and represents the set inclusion. $\subsetneq$
means “included but not equal” and must not be confused with $\not\subset$ (not included).

Definition 5 A cube $c$ is reducible if there exists a substitution $\theta$ such that
$c\theta \subsetneq c$.

Proposition 4 An irreducible reduction of a cube c always exists and is unique
up to variable renaming. It is denoted as reduc(c) and the set of reduced cubes
is denoted as $\mathcal{C}^r$. We also have reduc(c) $=_c$ c.

Remark: if a cube $c_0$ does not contain any variable (i.e. $c_0 \in \mathcal{C}_0$) then
reduc($c_0$) = $c_0$.

Example It is clear that reduc({a(x), a(1)}) = {a(1)}, but reduc({a(1,x), a(y,2)})
= {a(1,x), a(y,2)}.
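
Subsumption and reduction can be sketched in Python as follows. This is a naive backtracking illustration, not the authors' Prolog code. The encoding is an assumption of the sketch: a variable is a string starting with `?`, a constant is any other string, and a compound term or literal is a tuple `(symbol, arg1, ...)`. The reduction loop relies on the observation that a cube c is reducible exactly when c subsumes c minus one of its literals.

```python
def is_var(term):
    return isinstance(term, str) and term.startswith("?")


def match(pattern, target, subst):
    """One-way matching: extend subst so that pattern under subst equals target."""
    if is_var(pattern):
        if pattern in subst:
            return subst if subst[pattern] == target else None
        return {**subst, pattern: target}
    if isinstance(pattern, str) or isinstance(target, str):
        return subst if pattern == target else None
    if pattern[0] != target[0] or len(pattern) != len(target):
        return None
    for p, t in zip(pattern[1:], target[1:]):
        subst = match(p, t, subst)
        if subst is None:
            return None
    return subst


def subsumes(c1, c2, subst=None):
    """True when some substitution sends every literal of c1 into c2."""
    subst = {} if subst is None else subst
    literals = list(c1)
    if not literals:
        return True
    first, rest = literals[0], literals[1:]
    for lit in c2:
        extended = match(first, lit, subst)
        if extended is not None and subsumes(rest, c2, extended):
            return True
    return False


def reduc(cube):
    """Drop literals while the cube still subsumes the smaller cube."""
    cube = frozenset(cube)
    changed = True
    while changed:
        changed = False
        for lit in cube:
            smaller = cube - {lit}
            if subsumes(cube, smaller):
                cube, changed = smaller, True
                break
    return cube
```

With this encoding, `reduc({("a", "?x"), ("a", "1")})` keeps only `("a", "1")`, while `{("a", "1", "?x"), ("a", "?y", "2")}` is left unchanged, in agreement with the example above.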

The order relation $\leq_c$ formalizes information enrichment. Two cubes are
equivalent iff they capture the same piece of knowledge. As any information can
be represented by a reduced cube, we will define the infimum and supremum
operators on $\mathcal{C}^r$. We adopt the approach of [12] and [14], which allows a lattice
on the term algebra to be defined properly thanks to the anti-unification operator.



Definition 6 Let $\Phi$ be any bijection from Term $\times$ Term to Var.

$antiunif_\Phi(f(s_1, \ldots, s_m), f(t_1, \ldots, t_m)) =_{def} f(antiunif_\Phi(s_1, t_1), \ldots, antiunif_\Phi(s_m, t_m))$
for every function or constant symbol $f$,

$antiunif_\Phi(s, t) =_{def} \Phi(s, t)$ otherwise.

This definition is extended to anti-unification on literals and finally on cubes
as a mapping $antiunif : \mathcal{C} \times \mathcal{C} \to \mathcal{C}$ which is associative and commutative.

Example: p(x, g(y, b)) is the anti-unified literal of p(a, g(a, b)) and p(1, g(b, b)).
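
The recursive definition translates almost directly into code. The Python sketch below is an illustration under assumptions of its own: terms are encoded as tuples `(symbol, arg1, ...)`, constants as plain strings, variables as strings starting with `?`, and the bijection $\Phi$ is approximated by a dictionary `phi` that hands out a fresh variable name the first time a pair of terms is met and reuses it afterwards.

```python
def antiunify(s, t, phi):
    """Anti-unify two terms; phi remembers the variable chosen for each pair."""
    same_symbol = (
        isinstance(s, tuple) and isinstance(t, tuple)
        and s[0] == t[0] and len(s) == len(t)
    )
    if same_symbol:
        # Same function symbol and arity: recurse on the arguments.
        return (s[0],) + tuple(antiunify(a, b, phi) for a, b in zip(s[1:], t[1:]))
    if s == t and isinstance(s, str) and not s.startswith("?"):
        # Same constant symbol (a function symbol of arity 0).
        return s
    # Otherwise the pair is generalized by the variable Phi(s, t).
    if (s, t) not in phi:
        phi[(s, t)] = f"?v{len(phi)}"
    return phi[(s, t)]


lit1 = ("p", "a", ("g", "a", "b"))   # p(a, g(a, b))
lit2 = ("p", "1", ("g", "b", "b"))   # p(1, g(b, b))
generalized = antiunify(lit1, lit2, {})
```

Here `generalized` is `("p", "?v0", ("g", "?v1", "b"))`, i.e. p(x, g(y, b)) up to the names of the variables.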

In fact, anti-unification allows the infimum to generalize the terms so as
to properly enrich the set intersection on the cubes. The result of the
anti-unification of two cubes $c_1$ and $c_2$ is defined as the union of the
anti-unifications of every couple $(l_1, l_2)$ based on the same predicate name and
such that $l_1$ belongs to $c_1$ and $l_2$ belongs to $c_2$.

Definition 7 Let $c_1$ and $c_2$ belong to $\mathcal{C}^r$. The supremum and infimum operators
$\sqcup_c$ and $\sqcap_c$ are defined on $\mathcal{C}^r$ as follows:

$c_1 \sqcup_c c_2 =_{def} reduc(c_1 \cup c_2)$;

$c_1 \sqcap_c c_2 =_{def} reduc[antiunif_\Phi(c_1, c_2)]$.

Theorem 5 $(\mathcal{C}^r, \leq_c, \sqcup_c, \sqcap_c)$ is a lattice.



Remark: by a simple duality argument, a lattice structure on cubes can be
derived from the lattice structure defined on clauses (modulo $=_c$) by Plotkin
(see [18] page 163). This lattice on $\mathcal{C}/{=_c}$ includes the present lattice on $\mathcal{C}^r$
(up to variable renaming), but the definitions of infimum and supremum are
presented here in a more mathematical way that allows an easier implementation,
as the extension of Huet's recursive anti-unification algorithm to cubes lends
itself well to a CLP implementation. Thus the cube approach proposes a more
easily computable definition of the lattice than the initial definition of Plotkin.
Moreover this definition is restricted to a class of cubes that is well suited to
capturing the knowledge of our applications; it has also been implemented in
Prolog.

Let us denote by $\mathcal{C}_0$ the set of all the finite subsets of positive literals of
the propositional calculus. Corollary 6 shows that the cube lattice is a formal
extension of the propositional set-theory based lattice.

Corollary 6 $(\mathcal{C}_0, \subseteq, \cup, \cap)$ is a sublattice of $(\mathcal{C}^r, \leq_c, \sqcup_c, \sqcap_c)$.

3.2 Cubical Formal Analysis 

We are now ready to define a new formal concept analysis model in which the 
attributes of the objects are captured by a set of literals from a first order logical 
language instead of literals from a propositional language. Thus an object of the 
context is characterized by a logical cube. 
Definitions 8 A cubical context is a pair $(O, \xi)$ where $O$ is a finite set of
objects, and $\xi$ is a mapping from $O$ to $\mathcal{C}^r$.

This is a particular case of G-FCA in which the lattice $\mathcal{L}$ is $\mathcal{C}^r$ and the
notation $\zeta$ is replaced by $\xi$. Here, each object $o$ in $O$ has one and only one
image $p = \xi(o)$ in $\mathcal{C}^r$ which represents the set of (predicate) properties of $o$.

Definitions 9 The dual operators ' and ° between $O$ and $\xi(O)$ are defined by:

$A' =_{def} \sqcap_{c\; o_i \in A}\, \xi(o_i)$ and $B^{\circ} =_{def} \{o_i \in O \mid B \leq_c \xi(o_i)\}$

Thanks to G-FCA, we know that the pair of operators ' and ° is a Galois
connection between $O$ and $\xi(O) \subseteq \mathcal{C}^r$.

Definitions 10 A cubical concept is a pair $(A, B)$, $A \subseteq O$, $B \in \mathcal{C}^r$, such that:

$A' = B$ (up to variable renaming) and $B^{\circ} = A$

The set of all cubical concepts defined by the context $(O, \xi)$ is denoted as $L^c$.

Proposition 7 For cubical concepts $(A_1, B_1)$ and $(A_2, B_2)$ the relation defined
by: $(A_1, B_1) \sqsubseteq (A_2, B_2) \Leftrightarrow A_1 \subseteq A_2$ $(\Leftrightarrow B_2 \leq_c B_1)$ is an order relation on $L^c$.

Definitions 11 The supremum $\sqcup$ and infimum $\sqcap$ are respectively defined on
$L^c$ as follows:

$(A_1, B_1) \sqcup (A_2, B_2) =_{def} ((A_1 \cup A_2)'^{\circ}, B_1 \sqcap_c B_2)$

$(A_1, B_1) \sqcap (A_2, B_2) =_{def} (A_1 \cap A_2, (B_1 \sqcup_c B_2)^{\circ\prime})$

Theorem 8 $(L^c, \sqsubseteq, \sqcup, \sqcap)$ is a lattice.



Example Back to our accident analysis application, table 1 is more precisely
captured by the cubical context $C_2$:

$\xi$(Acc1) = {Pers(Crew(B0727)), Pers(ATC), Aircraft(B0727, TkOff),
           Aircraft(DC9, TkOff), Scen(Airmissground)}
$\xi$(Acc2) = {Pers(ATC), Pers(Crew(MD82)), Aircraft(B0727, Landing),
           Aircraft(MD82, Landing), Scen(Airmissground)}
$\xi$(Acc3) = {Pers(ATC), Aircraft(B0727, Landing), Aircraft(DC9, TkOff),
           Scen(Airmissground)}
$\xi$(Acc4) = {Aircraft(MD82, Landing), Scen(Windshear)}

For instance, we have:

{Acc1, Acc2}' = inf($\xi$(Acc1), $\xi$(Acc2)) = {Pers(ATC), Pers(Crew(x)),
                Aircraft(B0727, y), Aircraft(x, y), Scen(Airmissground)}

and:

{Pers(ATC), Pers(Crew(x)), Aircraft(B0727, y), Aircraft(x, y),
 Scen(Airmissground)}° = {Acc1, Acc2}
Clearly, as {Acc1, Acc2} is the extent of a concept, the two accidents are
correlated, and it is interesting to notice that the intent reveals that in both
cases a B727 was involved with another aircraft x in a collision scenario.
Moreover they were together in the same traffic phase y (thus, maybe,
contributing to the overshoot of the ATC's workload) and the crew of the second
aircraft, Crew(x), was also involved. This kind of mixed link was hidden by the
propositional representation and is revealed by the cubical formal analysis. Such
knowledge mining through incident/accident reports is of major interest for
aeronautical safety programs.
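
The computation of {Acc1, Acc2}' can be replayed with a self-contained Python sketch that chains anti-unification of the two cubes with reduction. It is a naive illustration, not the authors' Prolog or CLP implementation; the tuple encoding of literals, the `?`-prefixed variable names and all function names are assumptions of the sketch.

```python
from itertools import product


def match(pattern, target, subst):
    """One-way matching of pattern onto target (variables start with '?')."""
    if isinstance(pattern, str) and pattern.startswith("?"):
        if pattern in subst:
            return subst if subst[pattern] == target else None
        return {**subst, pattern: target}
    if isinstance(pattern, str) or isinstance(target, str):
        return subst if pattern == target else None
    if pattern[0] != target[0] or len(pattern) != len(target):
        return None
    for p, t in zip(pattern[1:], target[1:]):
        subst = match(p, t, subst)
        if subst is None:
            return None
    return subst


def subsumes(c1, c2, subst=None):
    """True when some substitution sends every literal of c1 into c2."""
    subst = {} if subst is None else subst
    literals = list(c1)
    if not literals:
        return True
    for lit in c2:
        extended = match(literals[0], lit, subst)
        if extended is not None and subsumes(literals[1:], c2, extended):
            return True
    return False


def reduc(cube):
    """Remove literals as long as the cube subsumes the smaller cube."""
    cube = frozenset(cube)
    changed = True
    while changed:
        changed = False
        for lit in cube:
            if subsumes(cube, cube - {lit}):
                cube, changed = cube - {lit}, True
                break
    return cube


def antiunify(s, t, phi):
    """Anti-unification of terms; phi plays the role of the bijection Phi."""
    if (isinstance(s, tuple) and isinstance(t, tuple)
            and s[0] == t[0] and len(s) == len(t)):
        return (s[0],) + tuple(antiunify(a, b, phi) for a, b in zip(s[1:], t[1:]))
    if s == t and isinstance(s, str) and not s.startswith("?"):
        return s
    return phi.setdefault((s, t), f"?v{len(phi)}")


def cube_inf(c1, c2):
    """Infimum of two cubes: anti-unify literals with the same predicate, then reduce."""
    phi = {}
    raw = {antiunify(l1, l2, phi) for l1, l2 in product(c1, c2)
           if l1[0] == l2[0] and len(l1) == len(l2)}
    return reduc(raw)


ACC1 = [("Pers", ("Crew", "B0727")), ("Pers", "ATC"),
        ("Aircraft", "B0727", "TkOff"), ("Aircraft", "DC9", "TkOff"),
        ("Scen", "Airmissground")]
ACC2 = [("Pers", "ATC"), ("Pers", ("Crew", "MD82")),
        ("Aircraft", "B0727", "Landing"), ("Aircraft", "MD82", "Landing"),
        ("Scen", "Airmissground")]
result = cube_inf(ACC1, ACC2)
```

The raw anti-unification contains nine literals; reduction removes the four that are subsumed by more specific ones, leaving the five literals of the intent given above. The variable shared by Pers(Crew(x)) and Aircraft(x, y) is the same because both come from the pair (B0727, MD82).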

The 12 cubical concepts and the diagram of the concept lattice

1: ({Acc1, Acc2, Acc3, Acc4}, {Scen(x), Aircraft(y, z)})

2: ({Acc1, Acc2, Acc3}, {Pers(ATC), Aircraft(B0727, z), Scen(Airmissground)})

3: ({Acc2, Acc4}, {Scen(x), Aircraft(MD82, Landing)})

4: ({Acc2, Acc3}, {Scen(Airmissground), Aircraft(B0727, Landing), Pers(ATC)})

5: ({Acc1, Acc3}, {Scen(Airmissground), Aircraft(DC9, TkOff),
    Aircraft(B0727, z), Pers(ATC)})

6: ({Acc4}, {Aircraft(MD82, Landing), Scen(Windshear)})

7: ({Acc3}, {Pers(ATC), Aircraft(B0727, Landing), Aircraft(DC9, TkOff),
    Scen(Airmissground)})

8: ({Acc2}, {Pers(ATC), Pers(Crew(MD82)), Aircraft(B0727, Landing),
    Aircraft(MD82, Landing), Scen(Airmissground)})

9: ({Acc1}, {Pers(Crew(B0727)), Pers(ATC), Aircraft(B0727, TkOff),
    Aircraft(DC9, TkOff), Scen(Airmissground)})

10: ({}, {All-properties})

11: ({Acc2, Acc3, Acc4}, {Scen(x), Aircraft(y, Landing)})

12: ({Acc1, Acc2}, {Pers(ATC), Pers(Crew(x)), Aircraft(x, z), Aircraft(B0727, z),
    Scen(Airmissground)})

From the previous constructions (Corollary 6) it is easy to prove: 

Corollary 9 C-FCA is an extension of FCA. 

In other words: any fundamental concept lattice is a cubical concept lattice. 

For instance, it is easy to notice that the cubical concept lattice $L_2$ has
$L_1$ (see fig. 1) as a sublattice, to which concepts 11 and 12 are added. Indeed,
if a first-order context $(O, \xi)$ contains no variable, it can be considered
without any transformation as a fundamental context and its concept lattice, say
$L_0$, can be computed. Due to the presence of functional terms and thanks to
anti-unification, the computation of its cubical concept lattice $L_0^c$ may induce
the presence of variables, thus $L_0 \subseteq L_0^c$. Nevertheless an adequate reduction
of the data of the first-order context can always lead to a lattice isomorphic
to that of fundamental FCA (but the corresponding context would not be a direct
representation of the real data).
Fig. 2. The Cubical Concept Lattice $L_2$



3.3 Implementations and Exploitations 

Prolog programs were designed so as to implement both the Cube model and
the Galois connection. Such programs can be considered as functions which
associate to each context C its concept lattice L: C $\mapsto$ L. The determination
algorithm relies on Theorem 3: the concept lattice L is determined by the finite
sequence $T_n$ of sets of concepts:

$T_0 = \{(\emptyset, P)\}$, $T_1 = \{(o'^{\circ}, o') \mid o \in O\}$

$T_{n+1} = \{\ldots \mid \ldots \in T_n\}$

$T_{n_{max}} = \{(O, \emptyset)\}$, $L = \bigcup_{n \in [0..n_{max}]} T_n$

In each case, the context and the concept lattice can be considered as a global
Prolog knowledge base C $\cup$ L on which knowledge exploration experiments are
performed^5. The knowledge base C $\cup$ L is used so as to look for contextual
dependencies between either objects or attributes. The induction of the
context-based rules is given by the generic frame of [11], widely developed in
ConImp by Burmeister [5]. For the propositional case, the techniques are based on
the property: $(\forall A \in P), \models (A \rightarrow (A'' - A))$.
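
The property can be illustrated on table 2 with a few lines of Python. This is only a naive sketch that derives one rule per single attribute; it is not the algorithm of ConImp, and the names `closure` and `rules` are introduced here for the example.

```python
# Context C_1, transcribed from table 2: object -> set of attributes.
CONTEXT = {
    "Acc1": {"crew-B0727", "ATC", "B0727-take-off", "DC9-take-off", "Airmissground"},
    "Acc2": {"crew-MD82", "ATC", "B0727-landing", "MD82-landing", "Airmissground"},
    "Acc3": {"ATC", "B0727-landing", "DC9-take-off", "Airmissground"},
    "Acc4": {"MD82-landing", "Windshear"},
}
ATTRIBUTES = frozenset().union(*CONTEXT.values())


def closure(attributes):
    """A'': attributes shared by all the objects that have every attribute of A."""
    attributes = frozenset(attributes)
    shared = set(ATTRIBUTES)
    for attrs in CONTEXT.values():
        if attributes <= attrs:
            shared &= attrs
    return frozenset(shared)


def rules():
    """For each single attribute a, the conclusion {a}'' - {a} of the rule a -> ..."""
    return {a: closure({a}) - {a} for a in sorted(ATTRIBUTES)}
```

On this context, `rules()["ATC"]` is {Airmissground} (every accident involving ATC is an airmissground scenario) and `rules()["Windshear"]` is {MD82-landing}, whereas `rules()["MD82-landing"]` is empty.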

A context rule generator was implemented so as to compute the set of
propositional rules deducible from the context. But unfortunately, in the
first-order case, the mechanisms described in the previous reference articles
cannot be simply adapted, for the following reasons:

- all the set-based definitions and mappings that are classically defined must be
  completely revised based on the cube model so as to make use of predicates;

- moreover, the subtraction operator used in the formula $A \rightarrow (A'' - A)$ has
  no equivalent when predicates are concerned.

Hence, specific work is currently under way on cube subtraction. At this time the
basic result we use is: $(\forall A \in P), \models A \rightarrow A^{\circ\prime}$.

^5 It must be noticed that the computing time is not a critical problem in our
projects, as the situations we are analyzing are characterized by a high
complexity and a very low evolution time. The lattice must be completely
determined and it is completely calculated.

The logical links between the features are captured by first-order rules which
are translated into production rules; this extension is currently the subject of
intensive experiments. These programming results will not be detailed here, but
one can notice that the cubical models allow new inner logical constraints to be
taken into account thanks to the variables. Comparisons between C-FCA and
proximate models such as logical scaling [19] are under study. Indeed, the
favorite domain of FCA and C-FCA is symbolic knowledge, but thanks to the CLP
experiments, many improvements are expected for the analysis of large numerical
databases, since a new extension of the cube model is now completely defined: the
Constrained-Cube model, in which the arguments of cubes are submitted to
order constraints, will make it possible to take into account a larger range of
applications thanks to a CC-FCA (Constrained Cubical Formal Concept Analysis),
which is the next step of our study [16].

4 Conclusion 

Thanks to the design of a Generalized Formal Concept Analysis, the integration
of the Cube model (a lattice structure on conjunctions of predicate literals)
into fundamental FCA was defined as a Cubical FCA. Its theoretical interest
also relies on several applications in which predicates and arguments are
required for knowledge exploration. A current study focuses on first-order
context-based rule generation, which is related to the classical approaches of
the ILP community. The new applications are pilot activity modeling (in which
context-dependent incidents are searched for [15]), and the correlations between
the context and the interactions between artificial systems and persons (e.g.
noise/annoyance analysis) [4].
Abstract. We propose a generalization of Formal Concept Analysis
(FCA) in which sets of attributes are replaced by expressions of an
almost arbitrary logic. We prove that all FCA can be reconstructed on
this basis. We show that from any logic that is used in place of sets of
attributes can be derived a contextualized logic that takes into account
the formal context and that is isomorphic to the concept lattice. We then
justify the generalization of FCA compared with existing extensions and
in the perspective of its application to information systems.



1 Introduction 

The origin of this work is the search for flexible organisations for managing,
updating, querying, or navigating in data. In this context, several roles are
played by possibly different people: designer, administrator, and end-user.
Hierarchical organisations are not flexible, and updating, querying and
navigation are difficult to reconcile (see for instance the view update problem
in data-bases). The literature shows that Formal Concept Analysis (FCA) is a good
candidate for supporting querying and navigation. However, we feel it is not
flexible enough as far as the description of data is concerned, and the
literature on FCA insists more on analysing a given context than on managing
contexts. In this article, we present an extension to FCA that allows for
flexible descriptions, and we sketch an organisation that handles updating,
querying, or navigation in data at the same time.

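Given a formal context $(\mathcal{O}, \mathcal{A}, \mathcal{I})$, where $\mathcal{O}$ is a set of objects, $\mathcal{A}$ is a set
of attributes, and $\mathcal{I}$ is a relation between objects and attributes (i.e., a
subset of $\mathcal{O} \times \mathcal{A}$), Formal Concept Analysis (FCA [Wil82,GW99a], Chapter 11 in
[DP90a]) defines concepts as maximal sets of objects that share the same
attributes. More formally, a concept is a pair $(O, A)$ where $O$ is a subset of
$\mathcal{O}$, $A$ is a subset of $\mathcal{A}$, such that the following relation holds:

$\sigma(O) = A$ and $\tau(A) = O$

where $\sigma : 2^{\mathcal{O}} \to 2^{\mathcal{A}}$, $\sigma(O) := \{a \in \mathcal{A} \mid \forall o \in O : (o, a) \in \mathcal{I}\}$

and $\tau : 2^{\mathcal{A}} \to 2^{\mathcal{O}}$, $\tau(A) := \{o \in \mathcal{O} \mid \forall a \in A : (o, a) \in \mathcal{I}\}$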

The application $\sigma$ returns for every set of objects the set of attributes that
are shared by all these objects. The application $\tau$ returns for every set of
attributes the set of objects that own all these attributes. The $O$ part of a
concept is called its extent, and the $A$ part is called its intent. The
fundamental theorem of FCA is that the set of all concepts that can be built on a
given formal context forms a complete lattice when it is ordered by set-inclusion
of concept extents. In fact, the pair $(\sigma, \tau)$ forms a Galois connection.

B. Ganter and G.W. Mineau (Eds.): ICCS 2000, LNAI 1867, pp. 371-385, 2000.
© Springer-Verlag Berlin Heidelberg 2000

Sébastien Ferré and Olivier Ridoux

FCA has received attention for its application in many domains such as in 
software engineering [Sne98,Lin95,KS94]. The interest of FCA as a navigation 
tool in general has also been recognized [GMA93,Lin95,VW95]. 

The various application domains bring the need for more sophisticated formal
contexts than the mere presence/absence of attributes. For instance, many
application domains use numerical values (e.g., lengths, prices, ages), and the
need to express negation and disjunction is often felt. In a much more
specialized scope, it is imaginable to use the type of software components
instead of attributes. Several enrichments to the attribute structure have been
proposed: e.g., many-valued attributes [GW99a], and first-order terms [CM98].
However, no single extended FCA framework covers all the concrete domains, or can
claim to cover all the concrete domains to come. So, we propose to construct a
more general framework for concept analysis, Logical Concept Analysis (LCA),
in which the logic of attributes becomes a parameter. This will allow for
instantiating the general framework by merely filling in a dedicated logic.

In the rest of this article, we will refer to the original form of concept analysis 
by FCA or standard CA, while we will refer to the concept analysis as it is 
developed in this article by LCA or generalized CA. The term CA will be used 
to talk about both forms at once. Section 2 goes from standard CA (FCA) to 
its generalized form, i.e., LCA. Section 3 defines how a contextualized logic can 
be derived from a logical context. Section 4 studies the contribution of LCA 
compared with existing extensions of FCA. Section 5 explains how LCA can 
provide an interesting organization framework for information systems in which 
objects are described by logical formulas. 

In this article, results are given without proofs. They can be found in [FR99]. 

2 From FCA to LCA

2.1 General Presentation 

We start by reformulating standard CA so that it can be generalized more
naturally. The first step of this reformulation consists in replacing context
$(\mathcal{O}, \mathcal{A}, \mathcal{I})$ by $(\mathcal{O}, 2^{\mathcal{A}}, i)$, where $2^{\mathcal{A}}$ is the power-set of $\mathcal{A}$ and $i$ is a
mapping from $\mathcal{O}$ to $2^{\mathcal{A}}$ defined by $i(o) := \{a \in \mathcal{A} \mid (o, a) \in \mathcal{I}\}$. No information
is lost, nor added, and both representations are equivalent:
$(o, a) \in \mathcal{I} \Leftrightarrow i(o) \supseteq \{a\}$.

Then, if applications $\sigma$ and $\tau$ are reformulated with mapping $i$ rather than
relation $\mathcal{I}$, the following equalities are easily obtained:

(1) $\sigma(O) = \bigcap_{o \in O} i(o)$    (2) $\tau(A) = \{o \in \mathcal{O} \mid i(o) \supseteq A\}$

The rest of FCA theory is kept unchanged. Now, by carefully studying proofs
in FCA, and considering the connection between algebraic structures and
lattices [DP90a], we observe that a necessary and sufficient condition for FCA is
that $(2^{\mathcal{A}}; \supseteq)$ be a lattice whose supremum (least upper bound) is $\cap$, and
infimum (greatest lower bound) is $\cup$. Then, $2^{\mathcal{A}}$ can be considered as a logic
where $\supseteq$ is the deduction relation, $\cap$ is disjunction, and $\cup$ is conjunction.

From this interpretation of FCA, it becomes natural to generalize it by
replacing the power-set of the set of attributes $2^{\mathcal{A}}$ by an arbitrary set of
formulas $\mathcal{L}$, with which are associated a deduction relation $\dot{\models}$, a disjunctive
operation $\dot{\vee}$, and a conjunctive operation $\dot{\wedge}$ (dots on symbols are only aimed at
differentiating them from those of the meta-language used in this document). So
as to keep FCA results in its generalized form (LCA), it is necessary and
sufficient that $(\mathcal{L}; \dot{\models})$ be a lattice whose supremum and infimum are
respectively $\dot{\vee}$ and $\dot{\wedge}$.

Example 1. An example of a logic usable in LCA is propositional logic, $\mathcal{P}$. On
the syntactic side, the set of propositions $\mathcal{P}$ contains atomic propositions
(taken in a set $A$), formulas 0 and 1, and is closed under binary connectors
$\dot{\wedge}$ and $\dot{\vee}$ and unary connector $\dot{\neg}$. On the semantic side, interpretations are
subsets of the set of atomic propositions $A$. Logic $(\mathcal{P}; \dot{\vee}, \dot{\wedge}, \dot{\models})$ satisfies LCA
conditions, because its semantics $(2^{2^{A}}; \subseteq)$ is a lattice whose supremum is
$\cup$, and infimum is $\cap$.



2.2 Context and Galois Connection 

In this section, we apply the idea introduced in the last section to give a
formulation of CA that is generalized to an almost arbitrary logic: Logical
Concept Analysis (LCA). Transposition from FCA to LCA consists in reformulating
each occurrence of $\mathcal{I}$ with $i$, and replacing $\supseteq$, $\cap$, and $\cup$ respectively by
$\dot{\models}$, $\dot{\vee}$, and $\dot{\wedge}$. Proofs are obtained from chapter 11 in [DP90a], and by
applying the above transposition.

Definition 1 (context) A (formal) context is a triple $(\mathcal{O}, \mathcal{L}, i)$ where:

- $\mathcal{O}$ is a finite set of objects,

- $(\mathcal{L}; \dot\models)$ is a (possibly infinite) lattice of formulas, whose supremum is $\dot\vee$, and
whose infimum is $\dot\wedge$; $\mathcal{L}$ denotes a logic whose deduction relation is $\dot\models$ and
whose disjunctive and conjunctive operations are respectively $\dot\vee$ and $\dot\wedge$,

- $i$ is a mapping from $\mathcal{O}$ to $\mathcal{L}$ that associates to each object a formula that
describes the object.

If $f \dot\models g$ and $g \dot\models f$, $f$ and $g$ are called logically equivalent; we will consider
them as different representations of the same equivalence class, and in fact we will
consider that elements of $\mathcal{L}$ are the equivalence classes. Also, operations $\dot\vee$ and $\dot\wedge$
can be either connectors (as in propositional logic $\mathcal{P}$), or algebraic operations
(as in the so-called cube logic [CM98]). We just assume, for practical reasons,
that operations $\dot\wedge$ and $\dot\vee$ are computable. A word of caution is necessary. The
logic $\mathcal{L}$ is completely independent from the set of objects. If the logic $\mathcal{L}$ is rich
enough to have quantifiers, the individuals upon which formulas are quantified
can never be the objects of the formal context. For instance, if $i(o) = \forall x.p(x)$, this
tells nothing about the fact that all objects have property $p$. It only says that
if $i(o') = p(1)$ then $o$ can be considered as being more generic than $o'$. A good
interpretation of a description like $i(o) = \forall x.p(x)$ is to read it as a polymorphic
type declaration. According to the Curry-Howard isomorphism, types can be
considered as formulas.

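FCA derives from a context two applications $\sigma$ and $\tau$ that form a Galois
connection. Here, the Galois connection is between sets of objects, and formulas.
Its definition is obtained by transposing equalities (1) and (2):

Definition 2 Let $(\mathcal{O},\mathcal{L},i)$ be a context, $O \subseteq \mathcal{O}$, and $f \in \mathcal{L}$.

$\sigma : 2^{\mathcal{O}} \to \mathcal{L}$, $\sigma(O) := \dot\bigvee_{o \in O} i(o)$        $\tau : \mathcal{L} \to 2^{\mathcal{O}}$, $\tau(f) := \{o \in \mathcal{O} \mid i(o) \dot\models f\}$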



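Lemma 1 Let $(\mathcal{O},\mathcal{L},i)$ be a context, $O, O_j \subseteq \mathcal{O}$, and $f, f_j \in \mathcal{L}$ for all $j \in J$,
where $J$ is a finite set of indices.

(i) $O \subseteq \tau(\sigma(O))$        (i') $\sigma(\tau(f)) \dot\models f$

(ii) $O_1 \subseteq O_2 \implies \sigma(O_1) \dot\models \sigma(O_2)$        (ii') $f_1 \dot\models f_2 \implies \tau(f_1) \subseteq \tau(f_2)$

(iii) $\sigma(O) = \sigma(\tau(\sigma(O)))$        (iii') $\tau(f) = \tau(\sigma(\tau(f)))$

(iv) $\sigma(\bigcup_{j \in J} O_j) = \dot\bigvee_{j \in J} \sigma(O_j)$        (iv') $\tau(\dot\bigwedge_{j \in J} f_j) = \bigcap_{j \in J} \tau(f_j)$

(v) $\forall o \in \mathcal{O} : \sigma(\tau(i(o))) = i(o)$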



Results expressed in this lemma are a transposition from those of standard
CA (see Lemma 11.4 in [DP90b]), except Lemma 1.(v) whose proof
uses the actual definitions of $\sigma$ and $\tau$.



Example 2. An example of a formal context will illustrate the rest of our
development on LCA. Context $K_{ex}$ is deliberately small and simple as it is aimed
at illustrating theoretic notions, and not at showing a realistic application of
LCA. The logic used in this context is propositional logic $\mathcal{P}$ with a set of
atomic propositions $\mathcal{A} = \{a, b, c\}$. We define context $K_{ex}$ by $(\mathcal{O}_{ex}, \mathcal{P}, i_{ex})$, where
$\mathcal{O}_{ex} = \{x, y, z\}$, and where $i_{ex} = \{x \mapsto a,\ y \mapsto b,\ z \mapsto c \wedge (a \vee b)\}$.
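
The context of Example 2 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: each propositional formula is identified with its set of models, so that deduction becomes set inclusion, disjunction becomes union, and conjunction becomes intersection. The names `WORLDS`, `atom`, `DESC`, `sigma`, and `tau` are hypothetical and introduced here for illustration.

```python
from itertools import combinations

ATOMS = ("a", "b", "c")
# An interpretation is represented by the set of atoms it makes true.
WORLDS = [frozenset(s) for r in range(len(ATOMS) + 1)
          for s in combinations(ATOMS, r)]

def atom(p):
    """Set of models of the atomic proposition p."""
    return frozenset(w for w in WORLDS if p in w)

A, B, C = atom("a"), atom("b"), atom("c")

# Context K_ex: x -> a, y -> b, z -> c AND (a OR b).
DESC = {"x": A, "y": B, "z": C & (A | B)}

def sigma(objs):
    """Disjunction (union of model sets) of the descriptions of objs."""
    return frozenset().union(*(DESC[o] for o in objs))

def tau(f):
    """Objects whose description entails f (model-set inclusion)."""
    return frozenset(o for o, d in DESC.items() if d <= f)

print(sorted(tau(sigma({"x", "y"}))))  # objects satisfying a OR b
print(sorted(tau(A)))                  # objects satisfying a
```

Representing formulas by model sets also makes logically equivalent formulas identical, which matches the convention of Definition 1 that elements of the logic are equivalence classes.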



2.3 Concepts 

In this section, we show that using our definitions of a context and Galois
connection $\sigma$-$\tau$, we retrieve all existing results about concepts. First, we recall the
definition of concepts (see Chapter 11 in [DP90a]), just adapting notations.

Definition 3 (concept) In a context $(\mathcal{O},\mathcal{L},i)$, a concept is a pair $c = (O,f)$
where $O \subseteq \mathcal{O}$, and $f \in \mathcal{L}$, such that $\sigma(O) = f$ and $\tau(f) = O$. The set of objects $O$
is the concept extent (written $ext(c)$), whereas formula $f$ is its intent (written
$int(c)$).

The main difference with standard CA is that the intent is now a formula
of the logic $\mathcal{L}$. The set of all concepts that can be built in a context $(\mathcal{O},\mathcal{L},i)$ is
denoted by $C(\mathcal{O},\mathcal{L},i)$, and is partially ordered by $\leq^c$ defined as follows.

Definition 4 (order $\leq^c$) Let $(O_1,f_1)$ and $(O_2,f_2)$ be in $C(\mathcal{O},\mathcal{L},i)$,

$(O_1,f_1) \leq^c (O_2,f_2) \iff O_1 \subseteq O_2$

This order is compatible with order on intents.

Proposition 1 $(O_1,f_1) \leq^c (O_2,f_2) \iff f_1 \dot\models f_2$

As in FCA, Definitions 3 and 4 lead to the following fundamental theorem. 

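Theorem 1 Let $(\mathcal{O},\mathcal{L},i)$ be a context, and let $J$ be a set of indices. The ordered
set $(C(\mathcal{O},\mathcal{L},i); \leq^c)$ is a finite lattice, whose supremum (least upper bound) and
infimum (greatest lower bound) are as follows:

$\bigvee^c_{j \in J} (O_j, f_j) = (\tau(\sigma(\bigcup_{j \in J} O_j)),\ \dot\bigvee_{j \in J} f_j)$

$\bigwedge^c_{j \in J} (O_j, f_j) = (\bigcap_{j \in J} O_j,\ \sigma(\tau(\dot\bigwedge_{j \in J} f_j)))$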

Example 3. Figure 1.(a) represents the Hasse diagram of the concept lattice of
context $K_{ex}$ (introduced in Example 2). Concepts are represented by a number
and a box containing their extent on the left, and their intent on the right. The
higher concepts are placed in the diagram the greater they are for order $\leq^c$.
It can be observed that the concept lattice is not isomorphic to the power-set
lattice of objects $(2^{\mathcal{O}_{ex}}; \subseteq)$. Indeed, set $\{x,y\}$ is not the extent of any concept,
because $\tau(\sigma(\{x, y\})) = \tau(a \vee b) = \{x, y, z\}$.
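
The observation of Example 3 can be checked by brute force. The sketch below is an illustration only: it assumes formulas are represented by their sets of models, uses hypothetical helper names (`atom`, `DESC`, `sigma`, `tau`, `concepts`), and enumerates every subset of objects, which is only reasonable for a toy context such as $K_{ex}$.

```python
from itertools import combinations

ATOMS = ("a", "b", "c")
WORLDS = [frozenset(s) for r in range(len(ATOMS) + 1)
          for s in combinations(ATOMS, r)]

def atom(p):
    return frozenset(w for w in WORLDS if p in w)

A, B, C = atom("a"), atom("b"), atom("c")
DESC = {"x": A, "y": B, "z": C & (A | B)}   # context K_ex

def sigma(objs):
    return frozenset().union(*(DESC[o] for o in objs))

def tau(f):
    return frozenset(o for o, d in DESC.items() if d <= f)

def concepts():
    """All pairs (extent, intent) obtained by closing each set of objects."""
    found = set()
    objs = sorted(DESC)
    for r in range(len(objs) + 1):
        for subset in combinations(objs, r):
            intent = sigma(subset)
            found.add((tau(intent), intent))
    return found

extents = sorted((sorted(ext) for ext, _ in concepts()),
                 key=lambda e: (len(e), e))
for ext in extents:
    print(ext)
```

Each pair produced this way is a concept in the sense of Definition 3, because closing the extent again leaves the intent unchanged (Lemma 1.(iii)). The set {x, y} does not appear among the printed extents.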

2.4 Labelling of Concept Lattices 

Similarly to standard CA, it is possible to label concept lattices with objects 
and formulas. 

In the perspective of our application, the designer will choose a logic $\mathcal{L}$,
an administrator will manage a formal context $(\mathcal{O},\mathcal{L},i)$, and an end-user will
navigate, query, and consult/create/delete designated objects by using arbitrary
formulas as labels. The end-user knows $\mathcal{L}$, but he does not necessarily know the
formal context. For instance, he might be discovering it through navigation. In
the following, we will ignore the possibility of labelling concepts with objects. It
is fully described in [FR99].

Definition 5 Let $(\mathcal{O},\mathcal{L},i)$ be a context. One defines a mapping $\mu$ labelling
concepts with formulas:

$\mu : \mathcal{L} \to C(\mathcal{O},\mathcal{L},i)$, $\mu(f) := (\tau(f), \sigma(\tau(f)))$.

Images of mapping $\mu$ are indeed concepts, by Definition 3 of concepts
and by properties of applications $\sigma$ and $\tau$ (Lemma 1). The next lemma gives
interesting properties of these labellings of concept lattices.

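Lemma 2 Let $(\mathcal{O},\mathcal{L},i)$ be a context, and $o \in \mathcal{O}$, $f \in \mathcal{L}$, $c \in C(\mathcal{O},\mathcal{L},i)$.

(1) $c \leq^c \mu(f) \iff int(c) \dot\models f$        (2) $\mu$ is surjective

(3) $\mu(int(c)) = c$        (4) $int(\mu(f)) \dot\models f$.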

Lemma 2.(1) shows that $\mu(f)$ is the greatest concept whose intent logically
entails $f$; and Lemma 2.(2) establishes that every concept is labelled at least
by one formula. Regarding relation between concept intents and concept labels,
Lemma 2.(3) shows that every concept is labelled by its intent; and Lemma 2.(4)
adds that every formula labelling a concept is logically entailed by the concept
intent. This means that concepts can be characterized by several formulas, the
most precise one being its intent. This idea is developed in Section 3 and useful
for the application sketched in Section 5.

Fig. 1. The concept lattice of context $K_{ex}$ (a) and its labelling (b). Panel (a) lists
concept, extent, and intent; panel (b) lists concept and formula.

Example 4. Figure 1.(b) represents the same concept lattice as Figure 1.(a),
but it does not associate the same information to concepts. The number of
each concept is reused in its box so as to identify it; formulas of the form $\bigvee A$
where $A \subseteq \mathcal{A}$ ($\bigvee \emptyset = 0$) are placed on the right of their labelled concept. For
instance, concept 1 is labelled by formula $a$ (i.e., $\mu(a) =^c 1$). In Figure 1.(b) we
have restricted labels to be formulas of the form $\bigvee A$, but it is only to have a
finite number of labels that are not all in the formal context; recall that every
formula in $\mathcal{P}$ labels something. It is important for the applications we have in
mind (i.e., querying and navigation) not to restrict labels to be (sub)formulas
of the logical context. This last point implies that some concepts are necessarily
labelled by several formulas, because there is a finite number of concepts (this
is indeed observed with concept 6).
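
As a rough sketch of the labelling of Example 4 (an illustration with hypothetical names, assuming formulas are represented by their sets of models), the mapping $\mu$ can be applied to the eight disjunctions of atoms of $K_{ex}$, grouping the labels by the extent of the concept they designate:

```python
from itertools import combinations

ATOMS = ("a", "b", "c")
WORLDS = [frozenset(s) for r in range(len(ATOMS) + 1)
          for s in combinations(ATOMS, r)]

def atom(p):
    return frozenset(w for w in WORLDS if p in w)

A, B, C = atom("a"), atom("b"), atom("c")
DESC = {"x": A, "y": B, "z": C & (A | B)}   # context K_ex

def sigma(objs):
    return frozenset().union(*(DESC[o] for o in objs))

def tau(f):
    return frozenset(o for o, d in DESC.items() if d <= f)

def mu(f):
    """Concept labelled by formula f: (tau(f), sigma(tau(f)))."""
    ext = tau(f)
    return ext, sigma(ext)

# Label concepts with every disjunction of atoms (the empty one is 0).
labels = {}
for r in range(len(ATOMS) + 1):
    for atoms in combinations(ATOMS, r):
        f = frozenset().union(*(atom(p) for p in atoms))
        name = " v ".join(atoms) or "0"
        labels.setdefault(mu(f)[0], []).append(name)

for ext, names in sorted(labels.items(), key=lambda kv: sorted(kv[0])):
    print(sorted(ext), names)
```

Eight labels are spread over seven concepts, so one concept necessarily receives two of them; in this sketch it is the top concept, labelled by both `a v b` and `a v b v c`.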



3 Contextualized Logic 

In a particular context, it is possible to order some properties, although they are
not comparable by $\dot\models$. For instance, if in some context every bird flies, then we
can say that property “bird” contextually entails the property “fly”, although
we have not necessarily $bird \dot\models fly$ in $\mathcal{L}$. We introduce a contextualized deduction
relation as a generalization of implications between attributes, that can be used
in standard CA for knowledge acquisition processes [GW99a,Sne98].
Definition 6 (contextualized deduction) Let $K = (\mathcal{O},\mathcal{L},i)$ be a context,
and $f,g \in \mathcal{L}$. One says that $f$ contextually entails $g$ in context $K$, which is
noted $f \dot\models^K g$, if $\tau(f) \subseteq \tau(g)$, i.e., if every object that satisfies $f$ also satisfies $g$.

Relation $\dot\models^K$ is a preorder, and the associated equivalence relation is noted $\dot\equiv^K$.

Definition 7 (contextualized logic) Let $K = (\mathcal{O},\mathcal{L},i)$ be a context. The term
contextualized logic denotes the partially ordered set $(\mathcal{L}^K; \dot\models^K)$ where
$\mathcal{L}^K := \mathcal{L}/\dot\equiv^K$ and

$\forall F, G \in \mathcal{L}^K : F \dot\models^K G \iff \exists f \in F, g \in G : f \dot\models^K g$.

Elements in $\mathcal{L}^K$ (equivalence classes of $\mathcal{L}$ modulo $\dot\equiv^K$) are called
contextualized formulas, and the class of a formula $f \in \mathcal{L}$ is denoted by $(f)^K$ (or more
simply by $f^K$, if non-ambiguous).

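Lemma 3 Let $K = (\mathcal{O},\mathcal{L},i)$ be a context, $f,g \in \mathcal{L}$, and $a,b \in C(K)$.

(1) $f \dot\models g \implies f \dot\models^K g$        (2) $\forall o \in \mathcal{O} : i(o) \dot\models f \iff i(o) \dot\models^K f$

(3) $f \dot\models^K g \iff \mu(f) \leq^c \mu(g)$        (4) $a \leq^c b \iff int(a) \dot\models^K int(b)$

(5) $\sigma(\tau(f)) \dot\equiv^K f$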

Lemma 3.(1) shows that contextualized deduction $\dot\models^K$ is only an enlargement
of initial deduction $\dot\models$. Lemma 3.(2) shows that object properties are not altered
by context: it neither adds nor deletes any property of objects. Lemmas 3.(3-4)
show that $\mu$ and $int$ are order-embeddings between $(\mathcal{L}; \dot\models^K)$ and $(C(K); \leq^c)$.
Lemma 3.(5) spots $\sigma(\tau(f))$ as the intent contextually equivalent to $f$.

A context plays the role of a theory extending the deduction relation and
enabling new entailments. Contextualized logic can also be seen as a means
for extracting knowledge from contexts. Two kinds of knowledge can thus be
extracted: knowledge about context by deduction, and knowledge on the
domain (from which context is extracted) by induction (e.g., generalizing from
$bird \dot\models^K fly$).

Example 5. With morphism $\mu$ between contextualized deduction and order on
concepts (Lemma 3.(3)), it is possible to use the labelled concept lattice (see
Figure 1.(b)) to study contextualized logic in context $K_{ex}$. For instance, as
concept 2 is smaller than concept 5, relation $c \dot\models^K b \vee c$ stands, which is already true
in $\mathcal{P}$. More generally, it can be seen that all valid deductions in $\mathcal{P}$ are retrieved
in contextualized deductions. Examination of the labelled concept lattice shows
that the context adds new valid deductions between formulas: e.g., $c \dot\models^K a \vee b$,
$a \vee b \vee c \dot\models^K a \vee b$.
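
Definition 6 reduces contextualized deduction to an inclusion of extents, which is easy to try out on $K_{ex}$. The following is a sketch under the assumption that formulas are represented by their sets of models; `ctx_entails` and the other names are hypothetical.

```python
from itertools import combinations

ATOMS = ("a", "b", "c")
WORLDS = [frozenset(s) for r in range(len(ATOMS) + 1)
          for s in combinations(ATOMS, r)]

def atom(p):
    return frozenset(w for w in WORLDS if p in w)

A, B, C = atom("a"), atom("b"), atom("c")
DESC = {"x": A, "y": B, "z": C & (A | B)}   # context K_ex

def sigma(objs):
    return frozenset().union(*(DESC[o] for o in objs))

def tau(f):
    return frozenset(o for o, d in DESC.items() if d <= f)

def entails(f, g):
    """Plain deduction in the logic: every model of f is a model of g."""
    return f <= g

def ctx_entails(f, g):
    """Contextualized deduction: every object satisfying f satisfies g."""
    return tau(f) <= tau(g)

print(entails(C, A | B))              # False: c alone does not give a OR b
print(ctx_entails(C, A | B))          # True in K_ex
print(ctx_entails(A | B | C, A | B))  # True in K_ex
print(ctx_entails(A, B))              # False: x satisfies a but not b
```

The second and third lines correspond to the two new deductions quoted in Example 5; the first shows that the former does not hold in the underlying propositional logic.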

We now consider connections between contextualized logic and concept
lattice. The aim is to show that the concept lattice forms a new logic, derived
from $\mathcal{L}$, and adapted to the context: the contextualized logic. For this, we use
labelling mappings $\mu$ and $int$ to establish a connection between formulas and
concepts.
Theorem 2 $(\mathcal{L}^K; \dot\models^K)$ and $(C(K); \leq^c)$ are isomorphic, with $\mu^K$ as an
isomorphism from contextualized formulas to concepts, and $int^K$ as an isomorphism
from concepts to contextualized formulas, defined by

$\mu^K : \mathcal{L}^K \to C(K)$, $\mu^K(F) := \mu(f)$ where $f \in F$

$int^K : C(K) \to \mathcal{L}^K$, $int^K(c) := int(c)^K$

Algebraic structures $(C(K); \vee^c, \wedge^c)$ and $(\mathcal{L}^K; \vee^K, \wedge^K)$ (where $\vee^K$, $\wedge^K$ denote
supremum and infimum of $(\mathcal{L}^K; \dot\models^K)$) are then isomorphic.

To summarize, there are three ways of considering a concept lattice, which is a
sign of flexibility: (1) extents ($\tau(\sigma(2^{\mathcal{O}}))$) ordered by set inclusion ($\subseteq$), (2) intents
($\sigma(\tau(\mathcal{L}))$) ordered by logical deduction ($\dot\models$), (3) contextualized formulas ($\mathcal{L}^K$)
ordered by contextualized deduction ($\dot\models^K$).

Finally, we consider connections between contextualized logic and initial
logic $\mathcal{L}$. The following relations stand between operations of both logics.

Theorem 3 Let $f,g \in \mathcal{L}$,

$f^K \wedge^K g^K = (f \dot\wedge g)^K$ and $f^K \vee^K g^K = (\sigma(\tau(f) \cup \tau(g)))^K$

It should be noticed that the mapping $(.)^K$ is a morphism from formulas to
contextualized formulas for the conjunctive operation, but not for the
disjunctive one. From this it follows that the concept lattice is not isomorphic to the
initial logic $\mathcal{L}$. Therefore, some properties of this logic can be lost in the concept
lattice. For instance, this is the case of the distributive property in context $K_{ex}$
(see Example 2). Indeed, propositional logic $\mathcal{P}$ is distributive, while the
following counter-example can be found in the concept lattice of context $K_{ex}$ (see
Figure 1): $2 \wedge^c (1 \vee^c 3) =^c 2$, and $(2 \wedge^c 1) \vee^c (2 \wedge^c 3) =^c 0$.

A concept lattice inherits all properties of $\mathcal{L}$ (modulo $\dot\equiv^K$) only if relation
$f^K \vee^K g^K = (f \dot\vee g)^K$ holds. For this, it is sufficient that $\tau(f \dot\vee g) = \tau(f) \cup \tau(g)$
(because this implies $\sigma(\tau(f) \cup \tau(g)) = \sigma(\tau(f \dot\vee g)) \dot\equiv^K f \dot\vee g$). In more concrete terms,
that amounts to laying down the following condition on formulas describing
objects:

(3) $\forall o \in \mathcal{O} : i(o) \dot\models f \dot\vee g \implies i(o) \dot\models f \ \vee\ i(o) \dot\models g$

This condition amounts to considering there is no form of disjunction in
object descriptions. Back to context $K_{ex}$, it is observed that the description of
object $z$ satisfies proposition $a \vee b$ but satisfies neither $a$ nor $b$, which falsifies
condition (3). This explains why the distributive property is lost in the concept
lattice.
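
Both observations can be replayed on $K_{ex}$ with a short sketch. It rests on assumptions that should be read as such: formulas are represented by their sets of models, all function names are hypothetical, and concepts 1, 2, and 3 of Figure 1 are taken to have extents {x}, {z}, and {y}, which is the assignment consistent with the counter-example stated in the text (the figure itself is not reproduced here).

```python
from itertools import combinations

ATOMS = ("a", "b", "c")
WORLDS = [frozenset(s) for r in range(len(ATOMS) + 1)
          for s in combinations(ATOMS, r)]

def atom(p):
    return frozenset(w for w in WORLDS if p in w)

A, B, C = atom("a"), atom("b"), atom("c")
DESC = {"x": A, "y": B, "z": C & (A | B)}   # context K_ex

def sigma(objs):
    return frozenset().union(*(DESC[o] for o in objs))

def tau(f):
    return frozenset(o for o, d in DESC.items() if d <= f)

def join(e1, e2):
    """Supremum of two concepts, given by their extents."""
    return tau(sigma(e1 | e2))

def meet(e1, e2):
    """Infimum of two concepts, given by their extents."""
    return e1 & e2

def violations(f, g):
    """Objects falsifying condition (3) for the pair of formulas f, g."""
    return sorted(o for o, d in DESC.items()
                  if d <= (f | g) and not (d <= f or d <= g))

# Assumed extents of concepts 1, 2, 3 (see the lead-in).
E1, E2, E3 = frozenset("x"), frozenset("z"), frozenset("y")

print(sorted(meet(E2, join(E1, E3))))            # 2 meet (1 join 3)
print(sorted(join(meet(E2, E1), meet(E2, E3))))  # (2 meet 1) join (2 meet 3)
print(violations(A, B))                          # objects breaking condition (3)
```

The two sides of the distributive law differ ({z} against the empty extent), and the only object reported for the pair a, b is z.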

4 Related Works 

The first related work to consider is FCA itself since any given finite lattice is
isomorphic to a concept lattice (see p. 27 in [GW99a]), and a formal context can
be reconstructed from any concept lattice. This amounts to compiling $\mathcal{L}$ into a
FCA formal context. But this compilation is not feasible because $\mathcal{L}$ is generally
an infinite and incomplete lattice. However, $\mathcal{O}$ is finite in all applications we have
in mind (managing concrete data collections). So, a finite number of $\mathcal{L}$ formulas
is actually used in any logical context. These formulas ordered by $\dot\models$ form a
diagram which could be completed into a finite lattice, which could be compiled
into a FCA formal context. This works for defining formal concepts, but it does
not work for labelling since any formula of $\mathcal{L}$ can be a label. It does not work
either for $\dot\models^K$. It is also not a good idea to compile the finite number of actually
used formulas because the context is bound to change as often as, say, the state of
a file system changes. So, we believe that LCA can be considered as an extension
to FCA because it makes it possible to handle arbitrary descriptions when we need them.

The need for using as a formal context a more refined structure than the 
object-attribute relation has already been remarked by previous authors. Two 
directions have been followed for extending FCA. The first one is to replace 
attributes with more complex data. For instance, Chaudron and Maille replace 
attributes by first-order terms with free variables [CM98]. The underlying logic 
is that of unification and anti-unification [Plo70]. Kuznetsov replaces attributes
by graphs that fit his application in automated learning well, and he mentions
that such a construction can be done for all sorts of formal contexts [Kuz99].
In fact LCA gives a framework for this direction, which comprises labelling 
and contextualized logic. In his thesis [Mai99], Maille defines independently an 
extension of FCA that is similar to LCA. His Section 6.3 is parallel to our 
Sections 2.2 and 2.3. In fact, both works are inspired by Davey and Priestley’s 
book [DP90a]. Maille does not describe labelling and contextualized logic, but 
he describes the concrete handling of a logic with constraints. 

The second direction is to transform a more refined context into a
classical formal context. The first paragraph of this section shows an example of
this direction. This idea has been applied to an object-attribute-value
relation [GW99b,Pre97]. The motivation is that multi-valued contexts (in fact
object-attribute-value relations) are widely used in the real world (e.g., in data-bases).
There exist two methods for extracting a classical formal context from a
multi-valued formal context.

One method uses conceptual scales. They are formal contexts whose objects 
are values and where attributes express some properties about values (e.g., 2 
has the property < 3). The concepts that follow from such formal contexts 
define a hierarchy among scale attributes. Then a mono-valued context can be 
derived by replacing values by scale attributes according to this hierarchy. Our
generalized context $(\mathcal{O},\mathcal{L},i)$ can be seen as a multi-valued context where $i$ is the
only attribute and $\mathcal{L}$ is its domain of values. Then, one has to find a scale for
retrieving the ordering $\dot\models$ on $\mathcal{L}$. But this is not feasible in general, as already
explained in the first part of this section.

The other method is called logical scaling. It amounts to expressing attribute- 
value relations in a formalized language (e.g., SQL) using a finite (preferably 
small) collection of unary predicates. Names are given to all the predicates. Then 
a classical formal context can be constituted by replacing values associated to 
every object by the names of all the predicates that are satisfied by the values. 
As in our framework, logical scaling can be applied to an arbitrary logic. The big
difference is that we manage all formulas of the logic, which is in general infinite.
It is of particular importance for applications we have in mind (see Section 5).

Some authors have also proposed to derive a logic from a standard
formal context. E.g., Wille and Ganter propose to consider the formal context as
defining preferred interpretations for the attributes considered as atomic
propositions [GW99b]. A propositional formula is true (a compound attribute is
all-extensional in Wille and Ganter’s work) in the derived logic if it is true in all
the preferred interpretations. This derived logic is in fact the contextualized logic
of LCA when $\mathcal{L}$ is the logic of propositions $\mathcal{P}$, and every object description is a
complete conjunctive clause.

5 Application to Logical Information Systems 

We plan to use LCA in the design of flexible information systems. In these
systems, LCA is not used only for structuring a posteriori a formal context,
but rather for structuring a priori incoming data. These systems try to merge a
querying facility with a navigation facility.

5.1 Querying and Navigating 

Navigating usually means following links from place to place to reach an
objective. Links can be directories, URLs, file system links, etc. A fundamental
concept in navigation is the path. A path is an ordered list of links that must
be followed to find a place or an object. If the information system has a tree
structure every object is accessible through only one path, and the game of
navigating is to find this path. If the structure is a graph there may be several
paths, and the game is to find one. Navigation requires erudition to know useful
paths, judgement for recognizing that a path may lead to what is looked for,
and some luck. Maintaining the information system structure, and keeping it
navigable (i.e., ensuring objects are in the proper place), is difficult. However, it
seems like the natural thing to do in information systems that have a structure
a priori (e.g., given by a classification).

Querying usually means elaborating a query that selects sufficiently few
objects so that the relevant objects can be recognized among the answers. It
supposes that every object has been indexed in some way. Indexes (and
accordingly queries) can be sets of key-words, full-text words, etc. Just like navigation,
querying requires erudition, judgement, and luck. Maintaining the information
system structure is easier since the structure is essentially flat. Conversely,
non-flat structures are not well-represented.

A concept lattice offers both navigation and querying but maintains the
coherence automatically. Navigation amounts to following down-links in the
concept lattice; concepts are considered as places, and objects that label a concept
are considered to be there [GMA93,Lin95,VW95]. Querying amounts to conjoining
the intent of the current path with some query, and to finding which concept this
conjunction labels. So doing, querying and navigation can be mixed in any order,
and the coherence between both is maintained by the coherence of intents and
extents. We have applied this idea to the design of a conceptual file system/shell.

Compared with previous works, the main originalities of our proposal are to
adopt the familiar interface of many Unix shells (i.e., commands ls, cd, rm, mv
are transposed from a hierarchical setting to a conceptual setting), and to handle
updates (e.g., via commands rm and mv). Typical applications are the
management of personal directories, folders, or software development environments. We
also envisage layman applications like catalogs or cookbooks (see [FR99] for an
experiment with a Vietnamese cookbook). An important feature is also that
concept lattices are used to give the semantics of the commands, but not in the
implementation. Instead, it is the contextualized logic of Section 3 that is used.

A short presentation of the conceptual shell is given in the next section, but
more details can be found in [FR00].



5.2 A Conceptual Shell 

A conceptual shell can be described informally by comparison with a classical
shell as follows: files become objects, paths become logical formulas, directories
become concepts or contextualized formulas, the root becomes the concept $\top^c$ or
formula $\top$, and the working directory becomes a working concept.

The shell commands cd and ls are transposed in the conceptual shell for
querying and navigating. Command cd merely maintains a working query, wq,
labelling a working concept similar to a working directory, and command ls
actually does the querying/navigation as follows. A command ls q returns a list
of objects whose description is contextually equivalent to $q \dot\wedge wq$ (i.e., the
objects labelling the concept $\mu(q \dot\wedge wq)$), plus a list of derived queries that possibly
characterize strictly smaller but non-empty concepts. The principle is that if a
derived query $q'$ is returned, the following holds:

(4) $\mu(q' \dot\wedge q \dot\wedge wq) <^c \mu(q \dot\wedge wq)$

The fact that answers to queries may contain other queries that are relevant
(i.e., satisfy the above inequality) is interesting in itself, and could be used in a
natural language man-machine interface: e.g., in a bookshop, “Q: Do you have
the Jungle Book? A: Yes we have, and also an illustrated version in the children 
department”. Here, Jungle Book is the query. Yes we have means that there is an 
object with a contextually equivalent description (e.g., Rudyard Kipling’s best- 
known book), and illustrated version in the children department means that there 
is also an object with a contextually strictly stronger description. According to 
the context, both illustrated and children can be derived queries. If the bookshop 
had only the illustrated version, the answer could have been “Yes, in the children 
department”. 

In fact, derived queries act as sub-directories of the working directory. Just
like the standard ls command, the conceptual logic ls command has an option -r
which tells it to search “recursively” a directory and its sub-directories. However,
in its details the conceptual query ls -r q is not recursive at all since it simply
returns the list of objects whose description satisfies $q \dot\wedge wq$ (i.e., the extent of
the concept $\mu(q \dot\wedge wq)$).

Implementing a logical information system such as described above seems
to require building the concept lattice, and being able to compute $\leq^c$ and $\mu$.
In fact, since the contextualized logic is isomorphic to the concept lattice by $\mu$,
operation $\leq^c$ on concepts can be replaced by operation $\dot\models^K$ on formulas ($\mu$
becomes unnecessary), whose computation can be reduced to the computation of $\tau$
(cf. Definition 6). Because of Definition 2, it is therefore sufficient to memorize
a Hasse subdiagram of the ordering of formulas ($\dot\models$), containing at least all the
descriptions of objects ($i(\mathcal{O})$) to place them in the diagram, and possibly others
like past queries.

This gives an approximation of the concept lattice whose size is in the order
of the number of objects instead of an exponential of this number. The concept
lattice can be seen as representing answers to all possible queries, while the Hasse
diagram is a cache of answers to past queries. Our working hypothesis is that the
end-user will navigate progressively, and will repeatedly use the same idioms.
One advantage of this structure is that it is not sensitive to the actual contents
of the formal context; in particular it is not sensitive to context changes (i.e., rm
and mv), but only to navigation steps.

All operations of the conceptual shell can be implemented using this data 
structure and the following primitive functions. 

Extent of a query  The extent of a query $q$ is noted $\tau(q)$ and returns all
objects whose description satisfies $q$. Operation $\tau$ is the same as in the Galois
connection of LCA (cf. Definition 2). It is computed as the union of all
objects accessible in the Hasse diagram starting from the node labelled $q$. It is
similar to ls -r q in a hierarchy.

Objects of a query  The objects of a query $q$, noted $t(q)$, are objects whose
description is contextually equivalent to $q$ (i.e., has the same extent):
$t(q) = \{o \in \mathcal{O} \mid \tau(i(o)) = \tau(q)\}$. It is computed as the union of first objects
accessible in the Hasse diagram starting from the node labelled $q$. It is similar to
ls q in a hierarchy.

Derived queries  $Inc(q) := \lceil \{x \in X \mid \emptyset \neq \tau(x \dot\wedge q) \subsetneq \tau(q)\} \rceil$ where $\lceil E \rceil$
denotes the set of greatest elements of $E$ according to the order $\dot\models$. Set $X$ is
chosen freely, but one shows that if every description of objects in $\tau(q)$ can
be expressed as a conjunction of formulas in $X$, then for every object in
$\tau(q)$ there is an $x$ in $Inc(q)$ such that the object is in $\tau(x \dot\wedge q)$. This
property ensures that every object is accessible through navigation. At any given
moment in a navigation, the set of formulas of the diagram has this property.

None of these operations actually computes $\dot\models$ because it is entirely cached in
the Hasse diagram. It is in command cd q that the position of $q \dot\wedge wq$ is searched
in the diagram, or inserted if necessary. It is only there that $\dot\models$ is used.
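
The three primitives can be sketched directly from their definitions. The code below is not the implementation described above: it recomputes $\tau$ by brute force instead of walking a cached Hasse diagram, it represents formulas by their sets of models, and every name (`objects`, `inc`, `ls`, the vocabulary `X`) is an assumption made for illustration on the toy context $K_{ex}$.

```python
from itertools import combinations

ATOMS = ("a", "b", "c")
WORLDS = [frozenset(s) for r in range(len(ATOMS) + 1)
          for s in combinations(ATOMS, r)]

def atom(p):
    return frozenset(w for w in WORLDS if p in w)

A, B, C = atom("a"), atom("b"), atom("c")
TOP = frozenset(WORLDS)                     # formula 1 (always true)
DESC = {"x": A, "y": B, "z": C & (A | B)}   # context K_ex

def tau(q):
    """Extent of a query: objects whose description satisfies q."""
    return frozenset(o for o, d in DESC.items() if d <= q)

def objects(q):
    """Objects whose description is contextually equivalent to q."""
    target = tau(q)
    return sorted(o for o, d in DESC.items() if tau(d) == target)

def inc(q, X):
    """Derived queries: maximal x in X with a non-empty, strictly smaller extent."""
    ext = tau(q)
    cand = {n: f for n, f in X.items() if tau(f & q) and tau(f & q) < ext}
    return sorted(n for n, f in cand.items()
                  if not any(f < g for g in cand.values()))

def ls(q, wq, X):
    """Conceptual ls: local objects and derived queries of q AND wq."""
    query = q & wq
    return objects(query), inc(query, X)

X = {"a": A, "b": B, "c": C}   # hypothetical navigation vocabulary
print(ls(TOP, TOP, X))   # at the root: no local object, three derived queries
print(ls(C, TOP, X))     # under query c: object z, nothing below
```

At the root no object is listed and the three atoms are proposed as derived queries; under query c the object z is found and no further refinement is offered.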
6 Conclusion

We have shown how FCA can be reconstructed when the formal context is not 
restricted to be an object-attribute relation. We show how to use an almost 
arbitrary logic instead. A by-product of this reconstruction is the derivation of a 
contextualized logic that adds to the logic of the formal context deductions that 
are only valid in the formal context. The contextualized logic plays the same 
role as the concept lattice, and corresponds to attribute implication [GW99b]. 

We propose to exploit contextualized logic for navigating in a conceptual 
shell. The main interest of concept analysis in this application is in tightly com- 
bining querying and navigation: contextualized formulas are at the same time 
queries and places where it is possible to read and write. Note that in this ap- 
plication, it is important that no information is lost in converting an extended 
formal context into a standard one. On one hand, the operations of the concep- 
tual shell make an intensive use of labelling functions and of the contextualized 
logic. So, it is important that concept analysis handles extended contexts di- 
rectly. On the other hand, these operations never use the concept lattice as 
such. Its role is played by the contextualized logic. This makes it easy to update 
a formal context. 

The logic used in the extended formal context is almost arbitrary, but new
constraints apply to it when placed in the perspective of implementing a logical
information system. Indeed, the primitive operations (e.g., deduction,
conjunction, computation of the extent of a formula) must be tractable. For instance, we
plan to apply LCA to the domain of software engineering. In this case,
description logics (DL [DLNS96,DLNN97]), which are expressive though tractable logics,
can be used for version and configuration management [Zel98]; and the logic of
type isomorphisms [Di 95] can be used for navigating in software component
libraries.

At a more theoretical level, two directions for future works are to study the 
possibility of having relations in the extended formal context, and to study the 
possibility given by views (e.g., like in data-bases) to hide details. This would 
be extremely useful with extended formal contexts that could be overloaded by 
details that do not concern every user. 
Abstract. Some possible treatments of incomplete knowledge in
conceptual data representation, data analysis and knowledge acquisition are
presented. In particular, some ways of conceptual scalings as well as the
role of the three-valued Kleene-logic are briefly investigated. This logic
is also one background in attribute exploration, a conceptual tool for
knowledge acquisition. For this method a strategy is given to obtain as
much of (attribute) implicational knowledge about a given “universe” as
possible; and we show how to represent incomplete knowledge in order to
be able to pin down the questions still to be answered in order to obtain
complete knowledge in this situation.



Introduction 

All our knowledge is incomplete. Therefore it is useful to have tools around 
to obtain as much of “certain” knowledge as possible in as many situations 
as possible. And in cases of incomplete knowledge it is desirable to be able 
to “measure” what is missing. In this note we want to present such tools for 
conceptual knowledge representation and conceptual knowledge acquisition. 

In connection with the development of the program “Conimp” of basic
Formal Concept Analysis,¹ which centers around attribute exploration (some
special form of knowledge acquisition in connection with FCA²), quite soon the
question arose, what one should do in the case that one cannot answer some of
the questions of Conimp in this subprogram. First possible answers were tested
in new versions of Conimp and finally published in [B91]. Recently, in connection
with a doctoral thesis³, much progress has been made in this direction. And the
exploration algorithm of Conimp has been improved in such a way that now the
expert can get maximal information about the valid attribute implications in
the unknown context with respect to his knowledge.

In the first section we give some basic definitions of formal contexts, incom- 
plete contexts and attribute implications. In the second section some proposi- 
tions provide conditions to check whether an attribute implication is certainly 

¹ In what follows Formal Concept Analysis will be abbreviated by FCA.

² See section 3 of this note for more details.

³ See [H00].
valid or possibly valid in a given incomplete context. The Kleene-Logic can be
used to characterise the validity of attribute implications, if the premise and the
conclusion are disjoint. In the third section we present an algorithm for attribute
exploration with incomplete knowledge. At the end of the exploration the expert
gets a list of implications which are certainly valid in the (unknown) universe, a
list of counterexamples for implications that are certainly not valid in the
universe, and a list of unknown implications. In this section an example is given to
explain how the algorithm works.

1 Fundamental Concepts 

1.1 Formal Contexts and the Meaning of Negative Entries 

Before we go into more details let us recall some of the fundamental concepts of
FCA. For more details see e.g. Ganter and Wille [GW99]. The basic structure
of FCA is that of a formal context (in what follows briefly called context) $\mathbb{K}$, i.e.
of a triple $(\mathbb{K} =)\ (G, M, I)$, where $G$ and $M$ are sets, the elements of which are
called objects and attributes, respectively, and where $I \subseteq G \times M$ is a binary
relation between these sets, where $(g,m) \in I$ (or $gIm$) is read as:

“the object $g$ has the attribute $m$” or “the attribute $m$ applies to the object $g$”.

Such formal contexts correspond to data tables with two possible values (e.g.
“yes”, usually represented by a cross “x”, and “no”, usually represented by a
blank entry, sometimes also by other symbols like “.” for a better reading of
the table), where only one of them (e.g. “yes” or “x”) is really relevant for
the further evaluation. A cross in the line labeled by the name of the object g
(i.e. in the context row of g) and in the column labeled by the name of the
attribute m (i.e. in the context column of m) means that the object g has the
attribute m, yet the meaning of a blank or period in the line for g and the
column of m is often not quite clear. Some of the possible meanings might be
that⁵

(a) the object g does not have the attribute m (main case),

(b) it is irrelevant whether or not the object g has the attribute m,

(c) at the time when the table was built, it could not be decided whether or
not the object g has the attribute m.

In FCA all such kinds of “negative entries” are usually treated in the same
way, and this treatment corresponds to case (a) above. Yet, except for the case
when the negation of m is also contained in the set of attributes, a “negative
entry” does not contribute to the formation of the “concepts”⁶. For this reason
such contexts are also called one-valued contexts. In what follows we shall also
refer to them as complete contexts in order to distinguish them from incomplete
contexts as introduced below.

⁴ In (possibly) incomplete contexts introduced below, we shall also use “+” for
the positive and “-” for the negative entries. Tables 2, 4, 5, 8, and 10
therefore contain examples of such formal contexts.

⁵ Cf. [B91] or [H00] for further possible meanings, e.g. one connected with some
probability measure.

⁶ Cf. the end of the next subsection.

1.2 Incomplete Contexts

In this paper we shall mainly be concerned with case (c), which leads to the
concept of an incomplete context, sometimes also called three-valued context,
but this name is somewhat misleading (see the discussion below). By an
incomplete context $\mathbb{K}_? := (G, M, \{+, ?, -\}, J)$ we understand a
quadruple $(G, M, \{+, ?, -\}, J)$, where again $G$ and $M$ are the sets of
objects and attributes under consideration, respectively, “+”, “?” and “-” are
the three possible entries of the corresponding table, and $J$ is a ternary
relation $J \subseteq G \times M \times \{+, ?, -\}$, which can also be
considered as the graph of a mapping (also designated by $J$)
$J : G \times M \to \{+, ?, -\}$ from the set $G \times M$ of all pairs of
objects and attributes into the set $\{+, ?, -\}$ of possible values. The
interpretation of the relation $J$ is as follows:

$(g, m, +) \in J$: it is known that the object g has the attribute m,

$(g, m, -) \in J$: it is known that the object g does not have the attribute m,

$(g, m, ?) \in J$: it is unknown whether or not the object g has the attribute m.

If an incomplete context $(G, M, \{+, ?, -\}, J)$ does not contain any question
marks as entries, then we want to identify it with the corresponding complete
context $(G, M, I)$, where $g\,I\,m$ if and only if $(g, m, +) \in J$.

For sets $A \subseteq G$ of objects we define

$A^\Box := \{m \in M \mid (g, m, +) \in J \text{ for all } g \in A\}$ to be the
set of all attributes in M applying to all objects in A (the certain intent
generated by A),

$A^\Diamond := \{m \in M \mid ((g, m, +) \in J \text{ or } (g, m, ?) \in J) \text{ for all } g \in A\}$
to be the set of all attributes in M possibly applying to all objects in A (the
possible intent generated by A); and dually:

$B^\Box$, the certain extent, and $B^\Diamond$, the possible extent generated by
a subset B of M, are defined in a dual way by exchanging above the roles of
objects and attributes.

If the context $(G, M, \{+, ?, -\}, J)$ is “actually a complete one”, i.e. if
there are no $g \in G$ and $m \in M$ such that $(g, m, ?) \in J$, then we have
$A^\Box = A^\Diamond$ and $B^\Box = B^\Diamond$. For such complete contexts we
define $A' := A^\Box = A^\Diamond$ and $B' := B^\Box = B^\Diamond$, and pairs
$(A, B)$ with $A = B' \subseteq G$ and $B = A' \subseteq M$ are called (formal)
concepts.
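The two intent operators can be made concrete with a few lines of Python. This is only a minimal sketch: the dictionary encoding of an incomplete context and the function names `attributes`, `certain_intent` and `possible_intent` are assumptions made for illustration and do not come from Conimp or from the paper. The small example context has three objects and three attributes with entries "+", "?" and "-".

```python
# Hypothetical encoding (an assumption, not the paper's): an incomplete context
# is a dict mapping each object to its row; a row maps attributes to "+", "?" or "-".
CONTEXT = {
    "g": {"a": "+", "b": "+", "c": "+"},
    "h": {"a": "?", "b": "-", "c": "+"},
    "k": {"a": "-", "b": "+", "c": "?"},
}

def attributes(ctx):
    """All attributes occurring in the context."""
    return {m for row in ctx.values() for m in row}

def certain_intent(ctx, objects):
    """Attributes known to apply ("+") to every object in `objects`."""
    return {m for m in attributes(ctx)
            if all(ctx[g][m] == "+" for g in objects)}

def possible_intent(ctx, objects):
    """Attributes that apply or might apply ("+" or "?") to every object."""
    return {m for m in attributes(ctx)
            if all(ctx[g][m] in ("+", "?") for g in objects)}

print(sorted(certain_intent(CONTEXT, ["h"])))        # ['c']
print(sorted(possible_intent(CONTEXT, ["h"])))       # ['a', 'c']
print(sorted(certain_intent(CONTEXT, ["g", "k"])))   # ['b']
print(sorted(possible_intent(CONTEXT, ["g", "k"])))  # ['b', 'c']
```

The certain intent is always contained in the possible intent, and the two coincide exactly when the rows involved contain no question marks.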

Let $\mathbb{K}_?^j = (G, M, \{+, ?, -\}, J^j)$ $(j = 1, 2)$ be two incomplete
contexts. We say that $\mathbb{K}_?^2$ is an extension of $\mathbb{K}_?^1$ if
and only if for all $g \in G$ and for all $m \in M$

- $(g, m, +) \in J^1$ implies $(g, m, +) \in J^2$, and

- $(g, m, -) \in J^1$ implies $(g, m, -) \in J^2$.

In particular, $\mathbb{K}_?^2$ is called a completion of $\mathbb{K}_?^1$ if
and only if $\mathbb{K}_?^2$ is an extension of $\mathbb{K}_?^1$ and it is
“actually a complete context”.

1.3 Attribute Implications 

Besides contexts, attribute implications are further important tools of
conceptual data analysis. Let $\mathbb{K} = (G, M, I)$ be a complete context.
For sets $A, B \subseteq M$ of attributes one designates by $A \to B$ an
(attribute) implication, and one says that this implication holds (is valid) in
$\mathbb{K}$ if and only if, for every object $g \in G$ to which all attributes
from A apply, all attributes from B apply, too ($A \subseteq g'$ implies
$B \subseteq g'$, for all $g \in G$).⁷ One can interpret $A \to B$ as the
propositional formula $\bigwedge_{a \in A} a \to \bigwedge_{b \in B} b$, where
the elements of M are considered as propositional variables.⁸ Such attribute
implications contain a lot of information about the data, and the concept
lattice⁹ of a given complete context $\mathbb{K}$ is determined up to
isomorphism by the knowledge of a set of attribute implications valid in
$\mathbb{K}$ and “generating” the set of all attribute implications valid in
$\mathbb{K}$ with respect to the Armstrong rules (for arbitrary sets A, B, C
and D of attributes):

$$\frac{}{A \to A}, \qquad \frac{A \to C}{A \cup B \to C}, \qquad \frac{A \to B, \quad B \cup C \to D}{A \cup C \to D}.$$

One should observe that we use implications in the material sense, and if A is a
set of attributes which do not jointly apply to any object of $\mathbb{K}$, then
A implies all other attributes, i.e. then $A \to M$ holds in $\mathbb{K}$.
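Derivability by the Armstrong rules can be tested without building derivations: by the standard completeness result for these rules, $A \to B$ is derivable from a set of implications exactly when B is contained in the closure of A under that set. The sketch below illustrates this test; the function names `closure` and `derivable` and the representation of an implication as a pair of Python sets are assumptions chosen for illustration.

```python
def closure(attrs, implications):
    """Smallest superset of `attrs` that respects every implication."""
    closed = set(attrs)
    changed = True
    while changed:
        changed = False
        for premise, conclusion in implications:
            # Fire an implication whose premise is satisfied but whose
            # conclusion is not yet fully contained in the closed set.
            if premise <= closed and not conclusion <= closed:
                closed |= conclusion
                changed = True
    return closed

def derivable(premise, conclusion, implications):
    """True if premise -> conclusion follows by the Armstrong rules."""
    return set(conclusion) <= closure(premise, implications)

# Hypothetical implications over attributes a, b, c, d.
IMPLICATIONS = [({"a"}, {"b"}), ({"b", "c"}, {"d"})]

print(derivable({"a", "c"}, {"d"}, IMPLICATIONS))  # True (third Armstrong rule)
print(derivable({"a"}, {"d"}, IMPLICATIONS))       # False
print(derivable({"a"}, {"a"}, IMPLICATIONS))       # True (first Armstrong rule)
```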

In the case of an incomplete context we assume the following

Semantical definition of validity and possibility (“our Kripke-semantics”):
An attribute implication $A \to B$ (or any other propositional formula with the
elements of M interpreted as propositional variables) holds with certainty (or
is certainly valid) in $\mathbb{K}_?$ if and only if it holds in every
completion of $\mathbb{K}_?$, and it is satisfiable (or possibly valid) in
$\mathbb{K}_?$ if and only if it holds in at least one completion of
$\mathbb{K}_?$.
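For small contexts this semantical definition can be checked directly by enumerating all completions. The following brute-force sketch does exactly that; it is exponential in the number of question marks and is meant only to illustrate the definition. The dictionary encoding and all function names are assumptions made for this example.

```python
from itertools import product

# Hypothetical encoding: object -> {attribute: "+", "?" or "-"}.
CONTEXT = {
    "g": {"a": "+", "b": "+", "c": "+"},
    "h": {"a": "?", "b": "-", "c": "+"},
    "k": {"a": "-", "b": "+", "c": "?"},
}

def completions(ctx):
    """Yield every complete context obtained by replacing each "?" by "+" or "-"."""
    cells = [(g, m) for g, row in ctx.items() for m, v in row.items() if v == "?"]
    for choice in product("+-", repeat=len(cells)):
        filled = {g: dict(row) for g, row in ctx.items()}
        for (g, m), value in zip(cells, choice):
            filled[g][m] = value
        yield filled

def holds(premise, conclusion, complete_ctx):
    """Validity of premise -> conclusion in a complete context."""
    for row in complete_ctx.values():
        intent = {m for m, v in row.items() if v == "+"}
        if premise <= intent and not conclusion <= intent:
            return False
    return True

def certainly_valid(premise, conclusion, ctx):
    return all(holds(premise, conclusion, c) for c in completions(ctx))

def possibly_valid(premise, conclusion, ctx):
    return any(holds(premise, conclusion, c) for c in completions(ctx))

print(certainly_valid({"a"}, {"c"}, CONTEXT))  # True
print(certainly_valid({"b"}, {"c"}, CONTEXT))  # False
print(possibly_valid({"b"}, {"c"}, CONTEXT))   # True
print(possibly_valid({"c"}, {"b"}, CONTEXT))   # False
```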

1.4 Elimination of Question Marks by Implications

If one wants a set $\mathcal{I}$ of implications to be valid in some completions
of an incomplete context $\mathbb{K}_? := (G, M, \{+, ?, -\}, J)$, then it may
be possible to replace question marks in the context row of some $g \in G$ and
in the context column of some $m \in M$ (i.e. for $(g, m, ?) \in J$)

- by a “+”, if there is $A \to B$ in $\mathcal{I}$ such that
  $A \subseteq g^\Box$ and $m \in B$;

- by a “-”, if there is $A \to B$ in $\mathcal{I}$ such that $m \in A$,
  $A \setminus \{m\} \subseteq g^\Box$ and $B \not\subseteq g^\Diamond$.

As an example we consider the attribute implication {cubic} → {not prime},
valid for the attributes of the incomplete context $\mathbb{K}_?$ shown in
Table 1, which transforms $\mathbb{K}_?$ into the complete context $\mathbb{K}$
also shown in this table.¹⁰
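The two replacement rules can be applied repeatedly until no further question mark can be resolved. The sketch below is one possible reading of them in Python; the function name, the dictionary encoding and the object labels `cube` and `prime_number` are hypothetical placeholders (they stand in for the two objects of Table 1, whose exact labels are not reproduced here).

```python
def eliminate_question_marks(ctx, implications):
    """Return a copy of ctx in which "?" entries forced by the implications are resolved."""
    result = {g: dict(row) for g, row in ctx.items()}
    changed = True
    while changed:
        changed = False
        for row in result.values():
            for premise, conclusion in implications:
                for m in list(row):
                    if row[m] != "?":
                        continue
                    certain = {a for a, v in row.items() if v == "+"}
                    possible = {a for a, v in row.items() if v != "-"}
                    if premise <= certain and m in conclusion:
                        # Premise certainly holds, so m must hold as well.
                        row[m] = "+"
                        changed = True
                    elif (m in premise and premise - {m} <= certain
                          and not conclusion <= possible):
                        # Conclusion certainly fails, rest of premise holds: m must fail.
                        row[m] = "-"
                        changed = True
    return result

# Hypothetical labels for the two objects of Table 1.
TABLE_1 = {
    "cube":         {"cubic": "+", "not prime": "?"},
    "prime_number": {"cubic": "?", "not prime": "-"},
}
RULES = [({"cubic"}, {"not prime"})]

for name, row in eliminate_question_marks(TABLE_1, RULES).items():
    print(name, row)
# cube {'cubic': '+', 'not prime': '+'}
# prime_number {'cubic': '-', 'not prime': '-'}
```

The loop terminates because every change removes one question mark, and the input context is left unmodified.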

⁷ One also expresses this by saying “the intent of each object has to respect
the implication $A \to B$”.

⁸ In the same way one could form other propositional (or first order) formulas
and define their validity recursively in a similar way. Of particular interest
are the clauses $\bigwedge_{a \in A} a \to \bigvee_{b \in B} b$, the importance
of which for FCA we cannot discuss here.

⁹ I.e. the ordered set
$\mathfrak{B}(\mathbb{K}) := (\{(A, B) \mid (A, B) \text{ a concept of } \mathbb{K}\}, \le)$,
where, for concepts $(A_1, B_1)$ and $(A_2, B_2)$ of $\mathbb{K}$, one defines
$(A_1, B_1) \le (A_2, B_2)$ if and only if $A_1 \subseteq A_2$ (or,
equivalently, $B_1 \supseteq B_2$). See [GW99] for more details.

¹⁰ Observe that 2^(…) - 1 is a (Mersenne) prime, and that {cubic} → {not prime}
is valid for all natural numbers.

Table 1. Elimination of “?”s by the implication {cubic} → {not prime}

K_?        | cubic | not prime
-----------+-------+----------
19^(…)     |   +   |     ?
2^(…) - 1  |   ?   |     -

K          | cubic | not prime
-----------+-------+----------
19^(…)     |   +   |     +
2^(…) - 1  |   -   |     -

2 Evaluation of Incomplete Contexts

2.1 Many-Valued Contexts

The way it is represented, an incomplete context just looks like an ordinary
many-valued context $(G, M, (W_m)_{m \in M}, J)$ (here with
$W_m = \{+, ?, -\}$ for all $m \in M$), where for each (many-valued) attribute
$m \in M$ there exists a non-empty set $W_m$ of possible attribute values; and
in the corresponding table one has as entry in the row of g and in the column of
m a value $J(g, m)$ from $W_m$.¹¹ In FCA many-valued contexts are usually
converted by some scaling into a one-valued context, for which one can then use
the usual methods of FCA (e.g. compute a basis of the set of attribute
implications or draw the concept lattice) and in this way analyse the data. The
idea of a conceptual scaling of a many-valued context consists in connecting
with each many-valued attribute m a scale context $\mathbb{S}_m$ (for incomplete
contexts this will usually be the same for all attributes). $\mathbb{S}_m$ has
$W_m$, or a superset of it, as set of objects, and it has new attributes
structuring $W_m$ according to the ideas of the user. In the process of scaling
each entry $J(g, m)$ is replaced, in the object row of g, by the object row of
$J(g, m)$ in $\mathbb{S}_m$ (in the examples below we have regrouped the columns
afterwards). For incomplete contexts one could e.g. take a nominal scaling
$\mathbb{S}_n$ or some of the other scales shown in Table 2.¹²



Table 2. Some conceivable scales to treat incomplete knowledge 




¹¹ Observe that this representation is similar to what happens in databases.
Observe, too, that in such many-valued tables incomplete knowledge is usually
detected much earlier, e.g. by a missing entry or by an additional value
“unknown” (or a similar one).

¹² Again we refer to [GW99] for more details. Observe that the process of
scaling is not unique and needs decisions of the user, as easily follows from
our examples below.

¹³ Here “t” stands for “true”, “f” for “false”, “u” for “unknown”, “p” for
“possible” and “c” for “certain”.

The corresponding scalings of the incomplete context $\mathbb{K}_?$ given in
Table 3 are shown in Tables 4, 5 and 6.¹⁴



Table 3. An incomplete context $\mathbb{K}_?$

K_? | a | b | c
----+---+---+---
g   | + | + | +
h   | ? | - | +
k   | - | + | ?

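Table 4. The scaling of $\mathbb{K}_?$ by $\mathbb{S}_n$

K_{S_n} | a_t | b_t | c_t | a_f | b_f | c_f | a_u | b_u | c_u
--------+-----+-----+-----+-----+-----+-----+-----+-----+-----
g       |  x  |  x  |  x  |     |     |     |     |     |
h       |     |     |  x  |     |  x  |     |  x  |     |
k       |     |  x  |     |  x  |     |     |     |     |  x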

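Table 5. The scaling of $\mathbb{K}_?$ by $\mathbb{S}_{pc}$

K_{S_pc} | a_p | b_p | c_p | a_c | b_c | c_c
---------+-----+-----+-----+-----+-----+-----
g        |  x  |  x  |  x  |  x  |  x  |  x
h        |  x  |     |  x  |     |     |  x
k        |     |  x  |  x  |     |  x  |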

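Table 6. The scalings of $\mathbb{K}_?$ by $\mathbb{S}_p$ and $\mathbb{S}_c$, respectively

K_{S_p} | a_p | b_p | c_p
--------+-----+-----+-----
g       |  x  |  x  |  x
h       |  x  |     |  x
k       |     |  x  |  x

K_{S_c} | a_c | b_c | c_c
--------+-----+-----+-----
g       |  x  |  x  |  x
h       |     |     |  x
k       |     |  x  |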

Although the scalings by $\mathbb{S}_n$ or $\mathbb{S}_{pc}$ allow one to
reconstruct the original incomplete context $\mathbb{K}_?$ (e.g.
$(h, a, ?) \in J$ can be read from $(h, a_p) \in I_{pc}$ and
$(h, a_c) \notin I_{pc}$), the implications certainly or possibly valid in
$\mathbb{K}_?$ can only be read indirectly from $\mathbb{K}_{\mathbb{S}_n}$ or
$\mathbb{K}_{\mathbb{S}_{pc}}$, respectively. E.g.
$\mathbb{K}_{\mathbb{S}_{pc}}$ is the apposition¹⁵ of
$\mathbb{K}_{\mathbb{S}_p} =: (G, M, J_p)$, which we call the possibility
context, and $\mathbb{K}_{\mathbb{S}_c} =: (G, M, J_c)$, which we call the
certainty context of $\mathbb{K}_? = (G, M, \{+, ?, -\}, J)$, and they allow one
to decide whether a given attribute implication $A \to B$ is certainly valid or
possibly valid in $\mathbb{K}_?$ (although this can also be decided directly in
$\mathbb{K}_?$, as will be seen below):

¹⁴ The notation $\mathbb{K}_{\mathbb{S}_*}$ shall indicate that this is a
scaling of $\mathbb{K}_?$ w.r.t. the scale $\mathbb{S}_*$.

¹⁵ The apposition of two contexts $\mathbb{K}_1 = (G, M, I)$ and
$\mathbb{K}_2 = (H, N, J)$ with $G = H$ is the “disjoint union” of the two
contexts (keeping the object set fixed):
$\mathbb{K}_1 | \mathbb{K}_2 := (G, M \mathbin{\dot\cup} N, I \mathbin{\dot\cup} J)$.

Proposition 1 (cv) $A \to B$ is certainly valid in $\mathbb{K}_?$ if and only
if, for every object $g \in G$, one has: if
$A \subseteq g^{J_p}\ (= g^\Diamond)$, then
$B \setminus A \subseteq g^{J_c}\ (= g^\Box)$.

(pv) $A \to B$ is possibly valid in $\mathbb{K}_?$ if and only if, for every
object $g \in G$, one has: if $A \subseteq g^{J_c}\ (= g^\Box)$, then
$B \subseteq g^{J_p}\ (= g^\Diamond)$.

In [B91] we have considered in addition a Boolean resolution of an incomplete
context $\mathbb{K}_?$.¹⁶ This is not a scaling, yet its valid attribute
implications are exactly the implications which are certainly valid in
$\mathbb{K}_?$.

2.2 The Role of the Three-Valued Kleene-Logic

Having three values instead of the usual two, and in particular their special
meaning, suggests using a three-valued logic. It has turned out that none of
the existing logics really fits “totally” to our semantical definition of
validity and possibility. Yet one logic gets very close, as we shall see in
Theorem 2, and that is the three-valued Kleene-logic for a propositional
language with logical operators $\wedge$, $\vee$, $\to$ and $\neg$, the truth
tables of which are given in Table 7.



Table 7. The truth tables of the three-valued Kleene-logic




That this logic does not always fit our Kripke-semantics is seen from the fact
that the implication $\{a\} \to \{a\}$ is valid in every completion of any
incomplete context having a as attribute, yet one also easily realizes that it
will get the truth value “?” whenever a gets this value. However, one has the
following result, which is very helpful to check the validity and possibility of
attribute implications with disjoint sets for premise and conclusion:

Theorem 2 Let $\mathbb{K}_? = (G, M, \{+, ?, -\}, J)$ be an incomplete context,
and let $\Phi$ be a propositional formula with propositional variables in M.
Then the three-valued Kleene-logic computes the truth value of $\Phi$ w.r.t.
$\mathbb{K}_?$ correctly in the sense of our Kripke-semantics, whenever every
propositional variable (i.e. attribute from M) occurs within $\Phi$ at most
once.

Corollary 3 Let $A \to B$ be an attribute implication w.r.t. some incomplete
context $\mathbb{K}_?$. Then the three-valued Kleene-logic computes its truth
value correctly when it is applied to $A \to (B \setminus A)$ instead of
$A \to B$.
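A small sketch may illustrate both the failure for $\{a\} \to \{a\}$ and the remedy of Corollary 3. It assumes the usual strong Kleene connectives (conjunction as minimum, negation as reflection, and $x \to y$ as $\neg x \vee y$) on the order "-" < "?" < "+", and it takes the value of an implication in a context to be the minimum of its values over all objects. The integer coding and the function names are assumptions made for this example; the truth tables of Table 7 are not reproduced here.

```python
# Assumed coding of the truth values: "-" < "?" < "+".
VALUE = {"+": 2, "?": 1, "-": 0}
SYMBOL = {2: "+", 1: "?", 0: "-"}

# Hypothetical encoding: object -> {attribute: "+", "?" or "-"}.
CONTEXT = {
    "g": {"a": "+", "b": "+", "c": "+"},
    "h": {"a": "?", "b": "-", "c": "+"},
    "k": {"a": "-", "b": "+", "c": "?"},
}

def kleene_row_value(A, B, row):
    """Strong Kleene value of (AND of A) -> (AND of B) for one object row."""
    premise = min((VALUE[row[m]] for m in A), default=2)
    conclusion = min((VALUE[row[m]] for m in B), default=2)
    return max(2 - premise, conclusion)  # x -> y evaluated as (not x) or y

def kleene_value(A, B, ctx):
    """Value in the whole context: conjunction (minimum) over all objects."""
    return SYMBOL[min((kleene_row_value(A, B, row) for row in ctx.values()), default=2)]

print(kleene_value({"a"}, {"a"}, CONTEXT))          # ?  (although certainly valid)
print(kleene_value({"a"}, {"a"} - {"a"}, CONTEXT))  # +  (evaluated as A -> B \ A)
print(kleene_value({"b"}, {"c"}, CONTEXT))          # ?  (possibly, not certainly valid)
print(kleene_value({"c"}, {"b"}, CONTEXT))          # -  (not even possibly valid)
```

For disjoint premise and conclusion the three outputs "+", "?" and "-" can then be read as certainly valid, possibly but not certainly valid, and not possibly valid, respectively.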

¹⁶ However, in connection with attribute exploration the corresponding algorithm
was not given correctly, since the already accepted implications have not been
considered.

Since the implications $A \to B$ and $A \to (B \setminus A)$ always contain the
same information, Corollary 3 shows a way in which attribute implications can be
evaluated syntactically w.r.t. any incomplete context.

It has turned out that, despite the modal formulation of our Kripke-semantics
in connection with incomplete contexts, the usual types of modal logics are not
appropriate. They may be useful for describing the validity of a propositional
formula for a single object, but if we consider the whole context then the above
Kripke-semantics seems to fit better.

3 Attribute Exploration

3.1 The General Idea of Attribute Exploration

One of the main situations where one often has to deal with incomplete
knowledge is the method of attribute exploration in FCA¹⁷, which is a special
kind of knowledge acquisition. Its main intention is to obtain complete
knowledge about the valid attribute implications in a universe, i.e. in a formal
context $\mathbb{K}_\mathcal{U} = (G_\mathcal{U}, M_\mathcal{U}, I_\mathcal{U})$,
where the set $G_\mathcal{U}$ of objects is usually too large to allow the
context to be represented or even to be known in all details, or it may only be
known implicitly: if some special question about it arises, then one hopes to be
able to answer it, or at least that a group of experts may agree on an answer.
The set $M_\mathcal{U}$ of attributes is a fixed finite choice of most interest
among all the attributes conceivable for $G_\mathcal{U}$. The idea of the
procedure can be sketched as follows:

A computer program, which can be fed in advance with some “background
knowledge”, systematically computes in some prescribed order proposals of
attribute implications which might be valid in the given universe according to
the input given so far.¹⁸ The user (e.g. an expert or a group of experts) then
has to accept the proposal or to refute it by a counterexample, i.e. by an
object from the universe. If the expert can answer each question completely, she
has produced at the end a complete subcontext
$\mathbb{K}^* = (G^*, M_\mathcal{U}, I_\mathcal{U} \cap (G^* \times M_\mathcal{U}))$
of $\mathbb{K}_\mathcal{U}$, where $G^* \subseteq G_\mathcal{U}$ contains for
each attribute implication not holding in $\mathbb{K}_\mathcal{U}$ a
counterexample; and she has produced a list of implications from which every
implication valid in the universe can be derived by the Armstrong rules listed
in section 1.3. In what follows we mainly want to discuss how to get optimal
information, and what this looks like, when one cannot answer all questions.

Although most attribute explorations carried out so far have mathematical
universes, this is not necessary. For example, one exploration known to us has
been carried out about some properties of pieces of music¹⁹; another one on some
properties of fairy tales, and one about animals. We shall use below for an
example some properties of natural numbers, since we hope that here everybody
will be able to check the details.

¹⁷ Cf. also [GW99], pp. 85ff or [B91].

¹⁸ They are possibly valid in the (possibly incomplete) context computed so far.

¹⁹ See [W89].

²⁰ See [Wo93].

3.2 An Algorithm of Attribute Exploration

Let $\mathbb{K}_\mathcal{U} = (G_\mathcal{U}, M_\mathcal{U}, I_\mathcal{U})$ be
an (unknown) universe. At the beginning of the exploration algorithm the user
has to enter the attribute set $M_\mathcal{U}$, and he may enter a set
$\mathcal{H}$ of background implications and a start context $\mathbb{K}_1$
containing some objects of the universe. It is also possible that
$\mathbb{K}_1$ contains question marks if the user does not always know which
attributes the objects have.

In the j-th step, for j = 1, 2, 3, ..., of the algorithm the set $P_j$ contains
the implications that are so far accepted as valid. At the beginning this set
is initialised with the empty set: $P_1 := \emptyset$. In step j the program
chooses an attribute implication $A \to B$ that might be valid in the universe
$\mathbb{K}_\mathcal{U}$. Here the set A is a minimal set (with respect to the
order “$\subseteq$”) respecting $P_j$, and
$B := \{m \in M \mid A \to m \text{ is satisfiable in } \mathbb{K}_j\} \ne A$.
If the implication $A \to B$ is derivable from $P_j \cup \mathcal{H}$ by
application of the Armstrong rules, then the implication is accepted
automatically²¹: $P_{j+1} := P_j \cup \{A \to B\}$. Otherwise the program asks
the user for the validity of this implication in the chosen universe. If the
user gives the answer “yes” then this implication is added to the actual set of
accepted implications: $P_{j+1} := P_j \cup \{A \to B\}$. If the user gives the
answer “no” then she has to enter a counterexample $g \in G_\mathcal{U}$ for
this implication. The row with this counterexample and all its entries with
respect to the attributes is added to the actual context (table), thus
obtaining $\mathbb{K}_{j+1} := \mathbb{K}_j + g$. If the user gives the answer
“unknown” then the program asks for which attributes $b \in B$ of the
conclusion the validity of the implication $A \to b$ is unknown. Let

$Z_j := \{b \in B \mid A \to b \text{ is unknown}\}$.

For each attribute $b \in Z_j$ a fictitious counterexample $g_{A,b}$ is added to
the actual context: $\mathbb{K}_{j+1} := \mathbb{K}_j + \{g_{A,b} \mid b \in Z_j\}$,
where the context row of $g_{A,b}$ is the smallest row (w.r.t. the information
order $+ > {?} < -$) which is a counterexample for the implication $A \to b$:

          |  A  | b | rest
----------+-----+---+------
g_{A,b}   | +++ | - | ???

The implication $A \to B \setminus Z_j$ is added to the actual set of
implications if $B \setminus Z_j \ne A$. After each step of the exploration
algorithm the program eliminates question marks in “real” (i.e. not fictitious)
examples using the accepted implications and the background implications (see
section 1.4), but one will lose information if this is also done in connection
with fictitious ones (this could be done only at the end of the algorithm). The
algorithm ends in step n if for all sets A which respect $P_n$ one has
$A = \{m \in M \mid A \to m \text{ is satisfiable in } \mathbb{K}_n\}$.
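The row of a fictitious counterexample is fully determined by the premise A and the attribute b. The following sketch builds such a row; the function name, the list of attribute names (taken from the example in section 3.3) and the dictionary encoding are assumptions for illustration and do not describe Conimp's internal data structures.

```python
def fictitious_counterexample(attributes, premise, b):
    """Smallest row refuting premise -> b: "+" on the premise, "-" on b, "?" elsewhere."""
    if b in premise:
        raise ValueError("b must not belong to the premise")
    return {m: "+" if m in premise else ("-" if m == b else "?")
            for m in attributes}

ATTRIBUTES = ["even", "odd", "prime", "square", "cubic", "sum2square"]

row = fictitious_counterexample(
    ATTRIBUTES, {"square", "cubic", "sum2square"}, "even")
print("".join(row[m] for m in ATTRIBUTES))  # -??+++
```

Leaving every remaining entry at "?" keeps the row minimal in the information order, so that it refutes $A \to b$ in every completion without deciding anything else.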

²¹ If one deals with other background knowledge than just implications (e.g.
pairs of attributes negating each other) other rules might be necessary, too,
like “Ganter's exhaustion rule” (cf. e.g. [G00]).

After the exploration has been finished, often some superfluous fictitious
objects must be removed from the context: If an unknown implication $A \to b$
follows from the accepted implications $P_n$, then the fictitious object
$g_{A,b}$ must be removed from the context. If there is a normal (i.e. not
fictitious) object g which is a counterexample of an unknown implication
$A \to b$, then the fictitious object $g_{A,b}$ is superfluous and can be
removed from the context, because g is then also a counterexample against every
implication which is not valid for $g_{A,b}$.

Let $\mathbb{K}^*$, with object set $G^*$, be the context resulting from the
last context $\mathbb{K}_n$ after removing the superfluous objects. Assume that
the user knew the set, say $G_W$, of those fictitious objects in $G^*$ for which
the corresponding implication is actually true in the universe
$\mathbb{K}_\mathcal{U}$.

Fact 1: Let W be the subset $G^* \setminus G_W$ obtained from $G^*$ by removing
the fictitious objects in $G_W$. Then the restriction $\mathbb{K}^*|_W$ of the
context $\mathbb{K}^*$ contains a complete system of counterexamples for the
implications which are not valid in the universe. The implications satisfiable
in $\mathbb{K}^*|_W$ are exactly the valid implications of the universe.

Fact 2: Let U be the union of the set of accepted implications with the set of
those unknown implications encoded by fictitious examples in $G_W$. Then U is a
generating set of implications for the valid implications in the universe: The
implications that are derivable from U (with the Armstrong rules) are exactly
the valid implications of the universe.

Fact 3: If there exists a question mark in the context row of a normal (i.e.
not fictitious) object g in $\mathbb{K}^*$, then this object can be removed
without changing the set of the satisfiable implications. Namely, for each
implication $A \to B$ for which such a g is a counterexample, there exists
another object which is also a counterexample. But the other object might be a
fictitious one, and then we would lose some information by removing g, because
with the information contained in the context without the object g we possibly
can no longer decide whether or not such an implication $A \to B$ is true in the
universe $\mathbb{K}_\mathcal{U}$. So in some cases it is better to keep such
objects with question marks.

3.3 An Example of Attribute Exploration

As an example we take as universe $\mathbb{K}_\mathcal{U}$ the context with the
set $\mathbb{N} = \{1, 2, 3, \ldots\}$ of all positive natural numbers as set of
objects, and with the attributes

even, odd, prime, square, cubic and sum2square,

where a number $n \in \mathbb{N}$ has the attribute sum2square if and only if
there exist two numbers $a, b \in \mathbb{N}$ with $n = a^2 + b^2$.

Before we begin with the exploration, we enter the following three implications
as background knowledge.

{even, odd} → {prime, square, cubic, sum2square},

{prime, cubic} → {even, odd, square, sum2square},

{prime, square} → {even, odd, cubic, sum2square}

Namely, they are valid in $\mathbb{N}$, since there does not exist any number
which satisfies the premise of any one of these implications. Thus the set
$\mathcal{H}$ consists of these implications. At the beginning of the
exploration we have to choose a context which we would like to start with. In
our example this will be the context that contains as objects the numbers less
than ten:

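Table 8. The (still complete) start context for the exploration

K_1 | even | odd | prime | square | cubic | sum2square
----+------+-----+-------+--------+-------+-----------
1   |  -   |  +  |   -   |   +    |   +   |     -
2   |  +   |  -  |   +   |   -    |   -   |     +
3   |  -   |  +  |   +   |   -    |   -   |     -
4   |  +   |  -  |   -   |   +    |   -   |     -
5   |  -   |  +  |   +   |   -    |   -   |     +
6   |  +   |  -  |   -   |   -    |   -   |     -
7   |  -   |  +  |   +   |   -    |   -   |     -
8   |  +   |  -  |   -   |   -    |   +   |     +
9   |  -   |  +  |   -   |   +    |   -   |     -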

Now the exploration program asks for the validity of implications in the
universe $\mathbb{K}_\mathcal{U}$:

First question:²² {cubic, sum2square} → {even}?

This implication does not hold in $\mathbb{N}$ and we find a counterexample:
$125 = 5^3 = 10^2 + 5^2$.

We add this new object to the context and enter the context row:

    | even | odd | prime | square | cubic | sum2square
----+------+-----+-------+--------+-------+-----------
125 |  -   |  +  |   -   |   -    |   +   |     +

Second question: {square, sum2square} → {even, odd, prime, cubic}?
Counterexample: 100 ($= 10^2 = 8^2 + 6^2$)

The exploration continues and after step 12 the algorithm ends. During the
exploration the three background implications of the set $\mathcal{H}$ are
accepted automatically and the 11th question of the program is answered with
“yes”:

{even, prime} → {sum2square}?

The number 2 is the only even prime and $2 = 1^2 + 1^2$ is the sum of two
squares, so this implication is valid.

Without an appropriate computer program it is difficult to find a cubic square
that is a sum of two squares but not even, so the following two questions are
answered with “unknown”:

6th question: {square, cubic, sum2square} → {even}

9th question: {odd, square, cubic, sum2square} → {even, prime}

These unknown implications lead to three fictitious counterexamples:

$g_1 := g_{\{square, cubic, sum2square\}, even}$

$g_2 := g_{\{odd, square, cubic, sum2square\}, even}$

$g_3 := g_{\{odd, square, cubic, sum2square\}, prime}$

The context at the end of the exploration is given in Table 9.

²² Since every implication $A \to A$ is trivially true, we only list the
implications in the form $A \to (B \setminus A)$.



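Table 9. The incomplete context with fictitious examples resulting from the
attribute exploration

K_13    | even | odd | prime | square | cubic | sum2square
--------+------+-----+-------+--------+-------+-----------
1       |  -   |  +  |   -   |   +    |   +   |     -
2       |  +   |  -  |   +   |   -    |   -   |     +
3       |  -   |  +  |   +   |   -    |   -   |     -
4       |  +   |  -  |   -   |   +    |   -   |     -
5       |  -   |  +  |   +   |   -    |   -   |     +
6       |  +   |  -  |   -   |   -    |   -   |     -
7       |  -   |  +  |   +   |   -    |   -   |     -
8       |  +   |  -  |   -   |   -    |   +   |     +
9       |  -   |  +  |   -   |   +    |   -   |     -
125     |  -   |  +  |   -   |   -    |   +   |     +
100     |  +   |  -  |   -   |   +    |   -   |     +
25      |  -   |  +  |   -   |   +    |   -   |     +
1000000 |  +   |  -  |   -   |   +    |   +   |     +
64      |  +   |  -  |   -   |   +    |   +   |     -
g_1     |  -   |  ?  |   ?   |   +    |   +   |     +
g_2     |  -   |  +  |   ?   |   +    |   +   |     +
g_3     |  ?   |  +  |   -   |   +    |   +   |     +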

We have four implications accepted as valid:

$P_{13}$ = {

{prime, cubic} → {even, odd, square, sum2square}

{prime, square} → {even, odd, cubic, sum2square}

{even, prime} → {sum2square}

{even, odd} → {prime, square, cubic, sum2square}

}

In this special example we can find with a computer an odd cubic square that is
a sum of two squares: $15625 = 125^2 = 25^3 = 35^2 + 120^2$. This number is a
counterexample for all three unknown implications. For each fictitious object g
the context row of 15625 is a completion of the context row of g. Therefore we
can replace the fictitious objects by 15625 to get a complete context which
contains a complete system of counterexamples against the implications which
are not valid in $\mathbb{N}$ w.r.t. our 6 chosen attributes.²³
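The computer search mentioned here is easy to reproduce. The sketch below is not the authors' program; the function names are chosen for illustration. It uses the fact that a number is both a square and a cube exactly when it is a sixth power, and it requires both summands of sum2square to be positive, as in the definition above.

```python
from math import isqrt

def is_prime(n):
    return n >= 2 and all(n % d for d in range(2, isqrt(n) + 1))

def is_square(n):
    return isqrt(n) ** 2 == n

def is_cubic(n):
    r = round(n ** (1 / 3))
    # Guard against floating-point rounding of the cube root.
    return any((r + d) ** 3 == n for d in (-1, 0, 1))

def is_sum2square(n):
    """True if n = a^2 + b^2 with positive integers a and b."""
    return any(is_square(n - a * a)
               for a in range(1, isqrt(n) + 1) if n - a * a > 0)

def context_row(n):
    """Attributes of n w.r.t. the six attributes of the example."""
    return {"even": n % 2 == 0, "odd": n % 2 == 1, "prime": is_prime(n),
            "square": is_square(n), "cubic": is_cubic(n),
            "sum2square": is_sum2square(n)}

def smallest_odd_cubic_square_sum2square(limit=100):
    """Search the odd sixth powers k^6 (k < limit) for a sum of two positive squares."""
    for k in range(1, limit, 2):
        if is_sum2square(k ** 6):
            return k ** 6
    return None

print(smallest_odd_cubic_square_sum2square())  # 15625
print(context_row(15625))
```

Under these definitions 15625 is in fact the smallest such number: the only smaller odd sixth powers are 1 and 729, and neither is a sum of two positive squares.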



Table 10. The complete context resulting from the attribute exploration

The four accepted implications form a generating system for the implications
which are valid in our chosen universe because the unknown implications are
not valid. The number 15625 is a counterexample for all three of the unknown
implications, and the accepted list is also the final one in this case:
$U = P_{13}$.²⁴

Starting from version 4.14 of the program Conimp mentioned in the introduction
such an algorithm of attribute exploration is implemented, which “behaves”
almost as described above.²⁵

4 Conclusion

Since all certain or possible logical information about possibly incomplete
contexts can be formulated by sets of clauses or attribute implications with
disjoint premise and conclusion, the three-valued Kleene-logic is an adequate
tool to evaluate (such) logical statements.

Attribute exploration is a very useful tool to get knowledge about the
dependencies between attributes of an “unknown” context (universe). Our
interactive algorithm only asks questions which are necessary to compute the
set of all attribute implications which are valid in the context. Even if the
expert cannot answer all questions, he gets maximal information about the valid
implications with respect to his knowledge and about what has been left open.

²³ If we had used the additional attributes sum2primes (= “is the sum of two
primes”) and sum2even (= “is the sum of two even numbers”), then we could not
have given a complete context at the end without answering the still unproven
Goldbach conjecture of number theory stating: “Every even number greater than 2
is the sum of two primes”. And in a similar way this method will often run into
old or new open problems in the field of investigation.

²⁴ Cf. Fact 2, above, for the notation.

²⁵ If one runs, with version 4.14 or with an earlier version of Conimp, into a
proposal of an implication $A \to B$ which one can neither prove nor disprove,
one should not “accept it as uncertain” by the corresponding option of the
program, but one should produce by oneself the (list of) fictitious objects for
those attributes m in the conclusion for which $A \to \{m\}$ can neither be
proved nor disproved. Conimp will then automatically ask for
$A \to B \setminus \{m\}$, if $B \setminus \{m\} \setminus A \ne \emptyset$,
etc. Yet in versions earlier than 4.14 question marks will be changed to
crosses, if possible, also in fictitious objects.

Abstract. Logic-based networks are semantic networks that support
reasoning capabilities. In this paper, knowledge processing within logic- 
based networks is viewed as three stages. The first stage involves the for- 
mation of concepts and relations: the basic primitives with which we wish 
to formulate knowledge. The second stage involves the formation of well- 
formed formulas that express knowledge about the primitive concepts 
and relations once isolated. The final stage involves efficiently processing 
the wffs to the desired end. Our research involves each of these steps as 
they relate to Sowa’s conceptual structures and Wille’s concept lattices. 
Formal Concept Analysis gives us a capability to perform concept formation
via symbolic machine learning. Concept(ual) Graphs provide a
means to describe relational properties between primitive concept and 
relation types. Finally, techniques from other areas of computer science 
are required to compute logic-based networks efficiently. This paper illus- 
trates the three stages of knowledge processing in practical terms using 
examples from our research. 



1 Introduction 

The research reported in this paper investigates the viability of conceptual graphs 
(CGs [24,25]) as a knowledge representation [20,10,12,18,19], its effectiveness as 
a graphical aid to cognition [10,11,2,3], and how a formal theory of order can be 
used to compute conceptual structures in an efficient way [9,7]. 

This paper is structured in five sections. I demonstrate the use of formal 
concept analysis (FCA) by way of a project for developing a medical retrieval 
system in Section 2. There (and again in Section 3) I show how FCA can be used 
to efficiently visualise and filter a document collection [6]. The approach re-uses a
medical thesaurus as a collection of attributes extracted from a document corpus.
Each document is in turn treated as an object. In small attribute collections this
approach has merit [4] but it becomes visually intractable when large numbers 
of attributes are used. Section 2 shows how we can relieve visual complexity 
and preserve the efficiency of the lattice-based approach. This work, extended

{{Problem List}

{ 0. Small cell ca left lung, 8 cycles chemotherapy 1990, plus
radiotherapy 1991. 1. Pulmonary embolus. 2. Glaucoma.
3. Peptic ulcer. 4. Cholecystectomy. 5. Appendicectomy.
6. Oophorectomy. 7. Right sided pneumonia and neutropaenia. }}
{{Discharge Treatment}
{ Coloxyl with senna 2 tabs bd. Ventolin 90 sec 4 hrly prn.
Ranitidine 300mg bd, Mylanta 20 mls tds prn.
Nifedipine 10mg tds, Panadeine forte 1-2 tabs 4 hrly prn,
Dipiverine Hydrochloride . 1 2 dps bd both eyes. }}
{{Information to Patient}
{ Patient aware of diagnosis and limited prognosis.
Knows to present to LMO or RAH with any problems. }}
{{Summary of Admission}
{ 68 year old woman, well known to S2. Discharged one week ago.
Day after discharge, developed increasing SOB with
yellow/white sputum production. Felt unwell, but denies
rigors or chills. Using Ventolin regularly with no
improvement. Transferred by ambulance to RAH. }}
{{Examination}
{ ; mildly tachypnoeac, RR 30, not cyanosed, looks unwell,
febrile 38.9, dehydrated, HR 120/reg, BP 140/90, JVP NR,
HS dual + nil, no ankle swelling, peripheral pulses all
present, TML, PN dull left base, BS vesicular, reduced
at left base, crackles right anterior chest. Abdominal
and neurological examination unremarkable. }}
{{Investigation} { }}
{{Progress}
{ a steady improvement made. Freely mobile around the ward
without oxygen on discharge. }}
{{Follow Up}
{ Chest Clinic/Dr. Holmes LMO to perform MBA20 prior to
DPD appt. }} {{Copies}{ lmo,file,ur. }}

Fig. 1. A typical medical discharge document with the sub-headings. 



in Section 3 to filter email [7], is more related to FCA, but has its analog in
concept(ual) graphs^1.

In Section 4, I illustrate the practical outcomes of the work of Prediger and
Wille [22] on concept graphs. This theoretical framework was coded as software
by Bernd Groh and reported in [15]. Groh’s software demonstrates how Wille’s
Power Context Family (PCF) can be interpreted as concept(ual) graphs and
provides us with an application that integrates FCA, relational databases and
a query interface using conceptual structures^2. Finally, in Section 5, a Web-

^1 A more complete version of this relationship is reported in these proceedings: R.
Cole and G. Stumme, “CEM - A Conceptual email Manager”, in this volume.

^2 See P. Eklund, B. Groh, G. Stumme and R. Wille, “A Contextual-Logic Extension
of TOSCANA”, elsewhere in this volume.

Fig. 2. An abbreviated view of the document context: the objects are document
numbers written down the rows and attributes are stemmed MeSH terms
written across the columns. In this example there are 100 documents in the
context. In the actual context there are slightly fewer than 4,000 documents and
more than 150,000 MeSH terms.



accessible knowledge representation toolkit is profiled [18,19]. Our claim is that 
WebKB was the world’s first precision Web-based information retrieval engine 
using conceptual graphs. Many of the lessons learned in its development are rel- 
evant to emerging Web standards such as XML/RDF. 

2 An Application of Formal Concept Analysis 

The first stage of knowledge processing within logic-based networks involves the
formation of concepts, i.e. the basic primitives such as objects, attributes,
relations, and concepts used to formulate knowledge. For the formalisation of those
primitives, we adopted the approach of Formal Concept Analysis (FCA). FCA
has been developed during the last twenty years and has already been successfully
applied to knowledge processing [27]. The mathematics of FCA is described
in Ganter and Wille [14].

Our first experience using FCA was to develop a medical retrieval system in
cooperation with the Royal Adelaide Hospital. Using 4,000 patient discharge
documents, one of which is shown in Fig. 1, we found a number of electronic
medical thesauri and built a data context with documents as objects
and medical terms as attributes, shown in Fig. 2. The project can therefore be
considered as a logic-based network where the concepts are the medical terms
and the relations are occurrences of those terms.
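
To make the notion of a data context concrete, the following minimal sketch builds a toy context with documents as objects and terms as attributes, and enumerates its formal concepts. The document identifiers and term assignments are invented for illustration, and the brute-force enumeration over attribute subsets is an assumption chosen for brevity; it is not the algorithm used in the project and would not scale to the real collection.

```python
from itertools import combinations

# Hypothetical toy context: document ids mapped to the terms found in them.
context = {
    "doc1": {"carcinoma", "drug therapy"},
    "doc2": {"carcinoma", "smoking"},
    "doc3": {"asthma"},
    "doc4": {"carcinoma", "drug therapy", "smoking"},
}
attributes = set().union(*context.values())


def extent(terms):
    """Documents that contain every term in `terms`."""
    return frozenset(g for g, ms in context.items() if terms <= ms)


def intent(docs):
    """Terms shared by every document in `docs` (all terms if `docs` is empty)."""
    return frozenset(m for m in attributes if all(m in context[g] for g in docs))


def concepts():
    """All formal concepts (extent, intent), found by closing every term subset."""
    found = set()
    for r in range(len(attributes) + 1):
        for subset in combinations(sorted(attributes), r):
            ext = extent(set(subset))
            found.add((ext, intent(ext)))
    return found


if __name__ == "__main__":
    for ext, itt in sorted(concepts(), key=lambda c: len(c[0])):
        print(sorted(ext), sorted(itt))
```

Each concept pairs a set of documents with exactly the terms those documents share, which is the unit displayed as a vertex in the line diagrams discussed in this section.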

Term identification and extraction from texts is a difficult problem. One of
the first tasks is to semi-automate term identification. Fig. 4 shows an interface
written for the clinician to locate and extract MeSH [1] terms from the
documents. The top left frame of Fig. 4 shows a document; the interface to

Fig. 3. A conceptual scale can be thought of as a theme that reduces the
display and computational complexity. This figure shows a conceptual scale
evident in the formal context shown in Fig. 2.



the top-right allows a text string to be entered, cross-referenced with aliased 
MeSH terms, and located in the text. The clinician is required to confirm likely 
matches, abbreviations and close spellings: once identified they are later recalled. 
This idea can be further extended to hyper-linking MeSH terms to navigate the 
document collection [16]. If we are interested in documents that contain the 
terms “carcinoma” and “Drug Therapy <1>” for example, the first document 
in this set of 214 is shown in Fig. 5. 

One of the principles of our work is the human-centred nature of reasoning
and decision making. We achieve a dramatic reduction in the complexity of both
algorithmic and visualisation components by placing the human operator at the
centre of the information filtering process. The human selects a theme and this
determines the output (these themes are called conceptual scales in the FCA
literature). This idea is shown in Fig. 6. With respect to logic-based networks,
the concepts are the MeSH terms (and their synonyms) and the relations are
the presence (or absence) of MeSH terms in the text. Fig. 4 is the interface
that permits these relations to be asserted. Both the computer and the user are
engaged in the process of concept identification.

Placing the human at the centre of the information flow is obvious but of- 
ten overlooked in intelligent systems. By asking the user to refine the scope of 
the exploration by defining a conceptual scale (theme) the complexity of the 
problem and its representation are decreased. In some ways the results of our 
work with medical texts are inconclusive. As an information retrieval metaphor, 
the concept lattices and conceptual scales have not been bench-marked against 
more established IR techniques but the intuitions are promising. Complexity can 
therefore be reduced by considering the following:

Fig. 4. The MeSH concept extraction program, showing the text (top-left)
together with the MeSH terms that are contained within it (middle-left). The
program was written by Richard Cole in Tcl/Tk.



Problem List

0. small cell carcinoma lung. 1. hypertension. 2. chronic airways
obstruction. 3. pulmonary emboli x2. 4. glaucoma. 5. peptic ulcer.
6. cholecystectomy. 7. appendectomy. 8. oophorectomy. 9. recent
pneumonia and neutropaenia associated with chemotherapy.

Discharge Treatment

propine eyedrops 2 each eye, ranitidine 150mg bd, ventolin nebs
90 sec qid, temazepam 10mg nocte, phenylephidrine nasal spray
qid both nostrils.

Information to Patient

patient to report any symptoms of infection should they occur.

Summary of Admission

67 year old widow, lives alone, small cell ca lung and siadh
diagnosed august 1990. has presented for 4th cycle of
chemotherapy, has had no problems since the last cycle, chronic
cough has only produced white sputum.



Fig. 5. First of 214 generated HTML documents hyper-linked on the MeSH
terms “carcinoma” and “Drug Therapy <1>”.

Fig. 6. Folding Poset Viewer with MeSH Concepts, displaying the theme created
by the user: exploring a possible relationship between asthma and carcinoma.



1. reducing the attribute set $M$ to a smaller set $M'$ by selecting only those
attributes representing a “focus” of interest. This is described as a
user-defined theme or conceptual scale.

2. reducing the size of the object set $G$ to $G'$ by collapsing documents into
equivalence classes, i.e. if two or more documents contain the same set of
attributes they can be considered as a single equivalence class of documents.

3. analysing the distribution of attributes in the document space, which can
lead to efficient encoding techniques that reduce the space complexity of the
representation.
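
The first two reductions listed above can be sketched in a few lines. This is only an illustration under stated assumptions: the toy context, the theme and the function names `restrict` and `collapse` are hypothetical and do not come from the system described here.

```python
from collections import defaultdict

# Hypothetical toy context: document ids mapped to the terms found in them.
context = {
    "doc1": {"carcinoma", "drug therapy"},
    "doc2": {"carcinoma", "smoking"},
    "doc3": {"asthma"},
    "doc4": {"carcinoma", "drug therapy", "smoking"},
}


def restrict(ctx, theme):
    """Reduction 1: keep only the attributes of the user-chosen theme."""
    return {g: ms & theme for g, ms in ctx.items()}


def collapse(ctx):
    """Reduction 2: group documents with identical attribute sets into one class."""
    classes = defaultdict(list)
    for g, ms in ctx.items():
        classes[frozenset(ms)].append(g)
    return dict(classes)


theme = {"carcinoma", "smoking"}
classes = collapse(restrict(context, theme))

if __name__ == "__main__":
    for terms, docs in classes.items():
        print(sorted(terms), docs)
```

After restriction to the theme, the four documents fall into three equivalence classes, so the lattice is computed over three objects and two attributes rather than the full context.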

Our emphasis is on visual outcomes used for text data mining. Showing that the
visual complexity of the lattice representation can be used to explore a document
collection in an intuitive human-centred interface is an important sub-goal. The
graphical interface allows key terms to be identified from a medical dictionary
of terms. It builds a theme (conceptual scale) whose concept lattice is a
$\vee$-homomorphic image of the concept lattice of the whole data context. Fig. 7
shows a nested line diagram of a simple scale exploring the relationship between
smoking and carcinoma.

Feedback on the creation of the conceptual scale is provided in a number
of ways. All synonyms of the key term to be scaled are shown and the number
of documents containing a key term is displayed; this gives feedback on the
uniqueness and relevance of the search term. For any given term, its children can
be listed as well as its parents; this gives the user an easy way of constraining

Fig. 7. A nested line diagram exploring the relationship between Carcinoma,
Chemotherapy and Smoking. The numbers attached to vertices indicate the
number of documents that cluster to that vertex.



search to include more documents through the use of a more general term, or
alternatively, specialise a term to focus the search on fewer documents. Once the
conceptual scale has been defined it may be saved together with an appropriate
labelled line diagram of its concept lattice. With respect to logic-based networks,
the conceptual scale is like a well-formed formula. The metaphor here is that the scale
can only be a structural combination of concepts organised hierarchically. When
two terms are added to the scale, their least upper bound is also added.

The presentation of the lattice is important and some specific lattice drawing
algorithms can be imported from the graph layout literature [8,26] although this
issue is far from conclusively researched [5]. Of further aid is the progressive
(un)folding of the finite lattice: this is user-controlled and provides a mechanism
for reducing the display complexity.



3 Email Concept Analysis (CEM) 

An extension of the application of FCA to document processing is to look at
emails as documents. The CEM program [6] follows from the medical document
system described above and embodies many of the elements of our research
philosophy. It is useful: it relies on humans to boot-strap its knowledge by creating
a conceptual ontology of terms and definitions and how these manifest. It is
flexible: it allows the operator to design their own theme. It is efficient: it uses fast
algorithms that compute in polynomial time [7]. Unlike the medical document
project, the terms are user-defined types. Each type has a “classifier” that tells
the information retrieval system how to associate a document to a type.

The CEM program is intended as a framework to create a visual and flexible
view over email. Its purpose is to pose questions about the ways in which
keywords combine in the email in order to extract (and find) interesting
information. The CEM program is based on a formal context $(G, M, I)$ where: the
object set $G$ consists of emails, the attribute set $M$ consists of terms for emails,
and $I$ is the relation between emails and their assigned terms.

$a \rightarrow b$ is used to denote the implication determined by a formal context:
every object of the context that has attribute $a$ will also have attribute $b$. In the
case of email, we allow the user to state a set of implications that are enforced
on the context. The implications are expressed by an order relation $\leq$ giving
rise to an ordered attribute set $(M, \leq)$ (the order relation is reflexive, transitive,
and anti-symmetric). A formal context $(G, M, I)$ is said to respect the set of
implications given by $(M, \leq)$ if $(g, m) \in I$ and $m \leq n$ always imply $(g, n) \in I$.
Consequently, if an email $g$ has attribute $m$ and the user has indicated that
$m \leq n$ then the email will also have attribute $n$.
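
The condition for a context to respect an ordered attribute set can be checked, and enforced, directly from its definition. The sketch below assumes the incidence relation is stored as a set of (email, type) pairs and the order as a set of (m, n) pairs meaning m ≤ n that is already transitively closed; the function names and the sample data are hypothetical.

```python
def respects(incidence, leq):
    """True if (g, m) in I and m <= n always imply (g, n) in I."""
    return all(
        (g, n) in incidence
        for (g, m) in incidence
        for (lower, n) in leq
        if lower == m
    )


def enforce(incidence, leq):
    """Smallest superset of I that respects the order (leq must be transitive)."""
    implied = {(g, n) for (g, m) in incidence for (lower, n) in leq if lower == m}
    return incidence | implied


# Hypothetical data: one email classified with a single type.
incidence = {("mail1", "From Eklund")}
leq = {
    ("From Eklund", "From KVO Group"),
    ("From KVO Group", "KVO Group"),
    ("From Eklund", "KVO Group"),
}
closed = enforce(incidence, leq)

if __name__ == "__main__":
    print(sorted(closed))
```

Because the order is assumed to be transitively closed, a single pass is enough: every type above an assigned type is added in one step.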

Each of the attributes may be associated with a classifier for associating emails. 
Two examples of classifiers are shown below. An email is said to be of the 
type “From Eklund” if it contains in the email “from” field the string 
“p.eklund@gu.edu.au”. The type “KVO Meeting” is an email from 
“p.eklund@gu.edu.au” with the email subject “KVO Meeting”. 

"From Eklund" From: p.eklund@gu.edu.au 

"KVO Meeting" From: p.eklund@gu.edu.au 

Subject: KVO Meeting 
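
A classifier of this kind can be sketched as a table of header conditions. The representation below (emails as plain dictionaries of header fields, conditions tested by substring match) is an assumption made for illustration; the text does not say how CEM stores or evaluates its classifiers.

```python
# Hypothetical classifier table: type name mapped to required header substrings.
CLASSIFIERS = {
    "From Eklund": {"From": "p.eklund@gu.edu.au"},
    "KVO Meeting": {"From": "p.eklund@gu.edu.au", "Subject": "KVO Meeting"},
}


def classify(headers):
    """Return the set of types whose every header condition matches the email."""
    return {
        name
        for name, conditions in CLASSIFIERS.items()
        if all(needle in headers.get(field, "") for field, needle in conditions.items())
    }


if __name__ == "__main__":
    mail = {"From": "Peter Eklund <p.eklund@gu.edu.au>", "Subject": "KVO Meeting"}
    print(sorted(classify(mail)))
```

An email that satisfies the conditions of several classifiers receives several types, which is what allows it to appear at a meet of those types in the concept lattice.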

The user is asked to create an implication ordering < between admissible types. 
"From Eklund" < "From KVO Group" 

"From KVO Group" < "KVO Group" 

In the logic-based network framework, the concepts are the types, and the
relations between the concepts and documents are determined by the user-defined
“classifier”. The type hierarchy gives us a way of formulating a coherent and
connected theme/scale over the email. Inference patterns are “suggested” by
the cardinality of emails that distribute over the lattice. The reflexive transitive
closure of the statements made by the user gives $(M, \leq)$.
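
The step from the user's statements to the ordered set can be illustrated with a naive fixed-point computation of the reflexive transitive closure. This is a sketch only; the function name is hypothetical and a production system would more likely use a dedicated graph algorithm.

```python
def reflexive_transitive_closure(statements, types):
    """All pairs (m, n) with m <= n implied by the user's statements."""
    closure = set(statements) | {(t, t) for t in types}
    changed = True
    while changed:
        changed = False
        for (a, b) in list(closure):
            for (c, d) in list(closure):
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure


statements = {
    ("From Eklund", "From KVO Group"),
    ("From KVO Group", "KVO Group"),
}
types = {"From Eklund", "From KVO Group", "KVO Group"}
order = reflexive_transitive_closure(statements, types)

if __name__ == "__main__":
    for pair in sorted(order):
        print(pair)
```

With the two statements given in the text, the closure adds the derived pair relating "From Eklund" to "KVO Group" together with the three reflexive pairs.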

A conceptual scale (or theme) is determined by a set of types selected by the
user. The user adds types to the conceptual view by searching over type names.
When a type is selected it is added to the diagram showing the hierarchical
ordering of the types. The diagram is completed by all types (ordered in advance)
greater than a selected type. For instance, selecting the attribute “From Peter
Deer” in Fig. 8 (top-left) forces the addition of “DSTO”.

The concept lattice of the conceptual scale is shown below the scale in Fig.
8; filled circles represent concepts completing the hierarchy of typed concepts

[Screenshot of show_mail.tcl with the panels “List of Emails” and “Selected Email”.]

Fig. 8. The CEM Program Interface. Top-left is the theme editor, bottom-left 
the concept lattice resulting from the theme. On the top-right email headers 
and lower-right the email contents. 



to obtain the whole lattice. The extent of the infimum of a selection of type
concepts (example: “From Peter Deer”, “Mention Richard”) consists of all objects
having the selected attributes; the number of those objects is attached to
the circle representing the infimum (24 in the lower-left of Fig. 8). In Fig. 8
(bottom-left), the extent numbers show that there are 2,640 emails related to
some research group. The 2,640 is split into 425 related to “DSTO” and 2,278
to “KVO”, 166 emails are related to both.
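
The extent of such an infimum is simply the set of emails carrying all of the selected types, and the number drawn on the vertex is its size. A minimal sketch, using invented emails and type assignments:

```python
def infimum_extent(context, selected_types):
    """Emails that carry every one of the selected types."""
    return {g for g, types in context.items() if selected_types <= types}


# Hypothetical context: email ids mapped to the types assigned by the classifiers.
emails = {
    "mail1": {"From Peter Deer", "DSTO", "Mention Richard"},
    "mail2": {"From Peter Deer", "DSTO"},
    "mail3": {"Mention Richard", "KVO"},
}
both = infimum_extent(emails, {"From Peter Deer", "Mention Richard"})

if __name__ == "__main__":
    print(len(both), sorted(both))
```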

The concept lattice may be large and difficult to comprehend. A successful
method to draw larger concept lattices is representation by nested line diagrams.
This is grounded on the construction of direct products of lattices. To
draw a concept lattice using a lattice product we divide the attribute set $M_S$
(user selected) into two sets $M_1$ and $M_2$. The two attribute sets can then be
used to construct two concept lattices $\mathfrak{B}(G, M_1, I \cap (G \times M_1))$ and
$\mathfrak{B}(G, M_2, I \cap (G \times M_2))$. Then the following mapping defines a
$\vee$-preserving embedding of $\mathfrak{B}(G, M_S, I)$ into the direct product of the
two constructed concept lattices:

$$(A, B) \mapsto \bigl( ((B \cap M_1)', B \cap M_1),\; ((B \cap M_2)', B \cap M_2) \bigr).$$

In Fig. 9, a nested line diagram of $\mathfrak{B}(G, M_S, I)$ is shown. The elements of the
first concept lattice, with $M_1$ := {“Groups”, “DSTC”, “From Melfyn”, “Mention Melfyn”},
are represented by large circles, each of which contains a copy of the diagram for
the second concept lattice, with $M_2$ := {“top”, “From KVO”, “From Bernd Groh”,
“From Peter Eklund”, “WebKB”}.

Fig. 9. Nested Line Diagrams reduce visual complexity.

This nesting of the line diagrams of the two lattices is a graphical representation
of the direct product of the lattices. The elements of the lattice product
that are not images of concepts of $\mathfrak{B}(G, M_S, I)$ are shown by grey circles. Grey
circles indicate implications that come from the data rather than from the hierarchy,
i.e. implicitly present in the data but not in the intended theory.
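
The embedding into the direct product, and the elements of the product that are not realised by any concept, can be computed for a small example. In this sketch each concept is represented by its intent only (the extent is determined by it), so the image of a concept with intent $B$ is the pair $(B \cap M_1, B \cap M_2)$. The toy context and the brute-force enumeration of intents are assumptions for illustration.

```python
from itertools import combinations, product


def extent(ctx, attrs):
    """Objects of the context that carry every attribute in `attrs`."""
    return frozenset(g for g, ms in ctx.items() if attrs <= ms)


def intents(ctx, universe):
    """Intents of all concepts of the context restricted to `universe`."""
    found = set()
    for r in range(len(universe) + 1):
        for subset in combinations(sorted(universe), r):
            ext = extent(ctx, set(subset))
            found.add(frozenset(m for m in universe if all(m in ctx[g] for g in ext)))
    return found


# Hypothetical emails and a split of the selected attributes into two scales.
ctx = {
    "e1": {"Mention Melfyn", "From Peter Eklund"},
    "e2": {"Mention Melfyn"},
    "e3": {"From KVO"},
}
m1 = frozenset({"Mention Melfyn"})
m2 = frozenset({"From Peter Eklund", "From KVO"})

whole = intents(ctx, m1 | m2)
realised = {(b & m1, b & m2) for b in whole}           # images of the embedding
grid = set(product(intents(ctx, m1), intents(ctx, m2)))  # direct product
grey = grid - realised                                  # drawn as grey circles

if __name__ == "__main__":
    for outer, inner in sorted(grey, key=lambda p: (sorted(p[0]), sorted(p[1]))):
        print(sorted(outer), sorted(inner))
```

In this toy data the product element with empty outer intent and inner intent {"From Peter Eklund"} is grey: every email from Peter Eklund also mentions Melfyn, an implication that holds in the data rather than in the declared hierarchy.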

In Fig. 9, where the outer scale is $M_1$ and the inner scale $M_2$, there are 113
emails that mention Melfyn (lower-left ellipse) and 569 that are from “Peter
Eklund” (middle vertex in upper-left ellipse). Of the 113 emails that mention
Melfyn, 48 (almost half) are also “From Peter Eklund” (middle vertex in
lower-left ellipse). Melfyn is mentioned mostly by Peter Eklund and Melfyn (likely
as a signature). Over half the email sent by Peter Eklund concerning the DSTC
mentions “Melfyn”. These are some of the simple inferences that can be read
from the nested line diagram in this simple example.

Fig. 10. Bernd Groh’s CGPCF TOSCANA integration. The lower left shows
a CG, top-left the relational database formulation of a schedule (as knowledge-base),
top-right cross-table as a Power Context Family (PCF), and bottom-right
the formal concept lattice of the cross-table.



4 Concept(ual) Graphs - CGPCF 

Conceptual Graphs (CG) can be used to describe facts about the world more
completely, but they have sometimes lacked a precise semantics and a scalable
implementation. Wille developed the theory of power context families (PCFs)
that presents concept graphs as algebraic entities called PCFs. Bernd Groh
developed an algorithm that can derive concept graphs from a PCF [15]. Groh
implemented this theory and released it as a CG implementation with a back-end
relational database management system, shown in Fig. 10.

CGPCF is an example of the convergence of theoretical and empirical theory
building. It is neither a general purpose programming language nor an application
program; it is a powerful, scalable and general technique for storing,
searching and retrieving complex data formulated as a knowledge-base.

From the point of view of logic-based networks, we have come up with a data
structure that contains explicit representations of concepts and relations
connecting concepts. Relations are therefore no longer user-determined but rather
user-defined. The problems that this imposes are many. If relations between
concepts are explicitly contained within the text we need mechanisms that facilitate
their extraction. The technical question is how to generate wffs from the PCF and
how to use those formulas for inference. This is a current area of investigation.
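
As a rough illustration of the data structure, a power context family can be held as one context over the entities and further contexts over tuples of entities. The sketch below reads simple CG-style edges off such a structure. It is not Groh's algorithm from [15]; the schedule data, the two-context layout and the function names are assumptions made for the example.

```python
# Hypothetical power context family for a small schedule:
# k0 relates entities to concept types, k2 relates pairs of entities to relation types.
k0 = {
    "lecture1": {"Lecture"},
    "room12": {"Room"},
    "eklund": {"Person"},
}
k2 = {
    ("lecture1", "room12"): {"Location"},
    ("lecture1", "eklund"): {"Agent"},
}


def box(entities, obj):
    """Render an entity as a concept box labelled with its types from k0."""
    types = ", ".join(sorted(entities[obj]))
    return f"[{types}: {obj}]"


def edges(entities, pairs):
    """One linear-form edge per (pair, relation type) entry of k2."""
    lines = []
    for (source, target), relations in sorted(pairs.items()):
        for relation in sorted(relations):
            lines.append(f"{box(entities, source)}->({relation})->{box(entities, target)}")
    return lines


if __name__ == "__main__":
    print("\n".join(edges(k0, k2)))
```

The point of the layout is that concepts and relations are both stored as ordinary formal contexts, so the same lattice machinery applies to each level.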



5 WebKB 



In terms of logic-based networks we now come full circle. WebKB [18,19] is an
important system because it is the first of its type that allows knowledge to
be described from the Web. WebKB can be thought of as a practical example
of logic-based networks at work to create a “semantic Web”. In this sense it
is ahead of the emerging Web standards in XML that will permit knowledge
to be embedded into Web documents.

WebKB is a system for annotating document elements in Web-pages with
conceptual graphs that are representative of semantic meaning. This is an
activity that in WebKB is done by human hand. However, we have learned in
this paper that it is possible to generate conceptual graphs automatically from
Web documents by exploiting type and relation hierarchies and concept and term
identification techniques, storing this knowledge as a power context family from
which a CG can automatically be derived. Once derived, the CGs can be used to
infer more knowledge. We call this idea of bringing the semantic content of two
or more documents together knowledge fusion. The result is a program called
HibKB (which I will demo at ICCS 2000). In the meanwhile it is worthwhile
reviewing WebKB since it demonstrates a high-level Web-based inferencing
system for CGs and how such a system can hyper-link a knowledge-base and
document collection.

Retrieving semantic content on the Web is an open research problem. Despite 
this, some of the infrastructure that supports this task is becoming available. 
Manually structuring Web documents using syntactic mark-up languages such as 
XML^3 allows Web-robots to retrieve relatively precise information using string
and structure-matching techniques. However, the Web robot approach is not 
scalable because fine-grained information is retrieved only when documents are 
richly structured and the querier understands this structure: the exact tag names 
and their order. Using representation languages as meta-data is a solution to the 
problem but it imposes several requirements on the representation language. 

A first requirement is that the notation be intuitive and concise enough to
be read and understood by people. Most current knowledge-oriented meta-data
languages are built on top of XML, e.g. RDF^4 and OML^5. Choosing XML ensures
that standard XML-based tools can parse and exchange meta-data. But standard
XML tools are of little interest in managing knowledge-oriented meta-languages
since specialised editors, analysers and inference engines are required.

^3 http://www.w3.org/XML/

Another requirement is that an author should be able to render knowledge 
statements visible. This is particularly true when visual or control languages are 
used. Then, for example, a knowledge base and its associated documentation can 
be integrated within the same document and both accessed and managed using 
classic information retrieval techniques (string-search, structure navigation via
table of contents, etc.) as well as knowledge-based techniques.

Another requirement of a meta-data language is that it should be both precise
and general enough to allow users to represent any Web-accessible information at
a desired level of precision. This means the meta-data language should have an
expressive formal model. Any formalism equivalent to first-order logic permitting
the use of nested “contexts” is an appropriate candidate, e.g. the Knowledge
Interchange Format (KIF)^6 or Conceptual Graphs (CGs)^7 [24]. It is important
not to restrict the expressivity of the language but it is reasonable to forgo some
features in return for efficiency. This means that an inference engine may ignore
some features of the knowledge representation language in the same way that an
HTML browser can exploit some feature tags but not others.

In summary, the three requirements for precise, flexible and scalable knowledge
and information retrieval imply, firstly, an easy-to-use notation that is
intuitive, precise and expressive and, secondly, the capacity to insert knowledge
anywhere within a Web document. WebKB^8 [18] responds to these requirements.
It interprets knowledge statements stored within Web-accessible documents.
The knowledge representation language is specified with a tag attribute,
e.g. <KR language="CG">. WebKB interprets the linear notation of CGs as well
as some user-friendly notations: a formalised English, a frame-like notation for
CGs, and structures that relate document elements by semantic relations^9.
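
To illustrate how knowledge statements embedded in a Web document might be located, the sketch below pulls every `<KR language="...">` element out of a page together with its declared language. Only the tag and its attribute come from the text; the regular-expression approach and the function name are assumptions and say nothing about how WebKB itself parses documents.

```python
import re

# Matches <KR language="X"> ... </KR>, capturing the language and the enclosed text.
KR_BLOCK = re.compile(
    r'<KR\s+language="(?P<lang>[^"]+)"\s*>(?P<body>.*?)</KR>',
    re.DOTALL | re.IGNORECASE,
)


def extract_statements(html):
    """Return (language, statement text) pairs for every KR element in `html`."""
    return [(m.group("lang"), m.group("body").strip()) for m in KR_BLOCK.finditer(html)]


if __name__ == "__main__":
    page = '<p>Notes</p><KR language="CG">[Cat]->(On)->[Table];</KR>'
    print(extract_statements(page))
```

Each extracted pair could then be handed to a parser for the named notation, leaving the surrounding HTML untouched.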

CGs were selected because they have a graphical notation and a linear no- 
tation, both concise and relatively intuitive, and secondly because we can reuse 
two CG inference engines (CoGITo [17] and Peirce [13]). The inference engines 
exploit subsumption defined between formal terms for calculating specialisation 
between CGs — and therefore between queries and facts in a knowledge base. 
Hence, statements and queries can be made at varying levels of precision. 

WebKB permits virtual documents [21] by combining lexical, structural and
knowledge-based data management: it proposes commands for searching and
joining CGs, Unix-like file management commands working on Web-accessible



^4 http://www.w3.org/RDF/
^5 http://wave.eecs.wsu.edu/CKRMI/OML.html
^6 http://logic.stanford.edu/kif/kif.html
^7 http://meganesia.int.gu.edu.au/~phmartin/WebKB/doc/CGs.html
^8 http://meganesia.int.gu.edu.au/~phmartin/WebKB/
^9 See “Conventions and Notations for Knowledge Representation and Retrieval”, P.
Martin in these proceedings.

Fig. 11. The WebKB menu and knowledge-based Information Retrieval/Handling
Tool. The example query shows how a document containing CGs (indexing images)
is loaded into the WebKB processor and then how the command spec searches
for specialisations of the CG
[Jetty] <- (Near) <- [Coco-tree] -> (On) -> [Beach].



documents and a simple Unix shell-like script language to combine commands.
These commands may be inserted in documents and sent to WebKB programs.

5.1 Representing Knowledge 

To compare the alternatives for knowledge representation and retrieval on the
Web, Fig. 12 shows how a simple sentence may be represented with CGs, with
KIF and with RDF. The sentence is:

“John believes that Mary has a cousin who has the same age as her”.

The CG representation (Fig. 12 top) seems simpler than the others. The semantic
network structure of CGs (i.e. typed concepts connected by typed relations)
has three advantages. Firstly, it restricts the notation without compromising
expressivity, which reduces computational overhead when comparing CGs.
Secondly, it encourages users to use explicit relations between concepts. Finally,
it permits a better visualisation of relations between concepts: relations can
be rendered as edges and concepts as vertices.

<KR language="CG">
load "http://www.bar.com/topLevelOntology"; //Import this ontology
Age < Property; //Declare Age as a subtype of Property
Cousin(Person, Person) {Relation type Cousin};
[Person: "John"] <- (Believer) <-
  [Descr: [Person: "Mary"] -{ (Chrc) -> [Age: *a];
                              (Cousin) -> [Person] -> (Chrc) -> [*a];
                            }]; </KR>

<KR language="KIF">
load "http://www.bar.com/topLevelOntology"; //Import this ontology
(Define-Ontology Example (Slot-Constraint-Sugar topLevelOntology))
(Define-Class Age (?X) :Def (Property ?X))
(Define-Relation Cousin(?s ?p) :Def (And (Person ?s) (Person ?p)))
(Exists ((?j Person))
  (And (Name ?j John) (Believer ?j
    '(Exists ((?m Person) (?p Person) (?a Age))
       (And (Name ?m Mary) (Chrc ?m ?a)
            (Cousin ?m ?p) (Chrc ?p ?a)
    ))))) </KR>

<!-- RDF notation; assumed location: http://www.bar.com/example -->
<RDF xmlns="http://www.w3.org/TR/WD-rdf-schema#"
     xmlns:t="http://www.bar.com/topLevelOntology#">
  <Class ID="Age"><subClassOf resource="t:Property"/></Class>
  <PropertyType ID="Cousin" comment="Relation type Cousin">
    <range resource="t:Person"/>
    <domain resource="t:Person"/></PropertyType> </RDF>
<RDF xmlns="http://www.w3.org/TR/WD-rdf-schema#"
     xmlns:x="http://www.bar.com/example#"
     xmlns:t="http://www.bar.com/topLevelOntology#">
  <t:Person bagID="Statement_01">
    <t:Name>Mary</t:Name>
    <t:Chrc><x:Age ID="age"></x:Age></t:Chrc>
    <x:Cousin><t:Person><t:Chrc resource="x:age"/></t:Person></x:Cousin>
  </t:Person>
  <Description aboutEach="#Statement_01" t:Believer="John"/> </RDF>

Fig. 12. Comparing knowledge representation with CGs, KIF and RDF.

Users want flexibility. They do not always have time to declare and order some
of the terms used when representing knowledge. This is the case when indexation
is for private knowledge organisation purposes or when detail is omitted during
work in progress. To permit this, and still allow the system to perform some
minimal semantic checks, basic declared relation types are used and concept
types left undeclared. The rationale for this is that when knowledge statements
are made from concepts linked by basic relations, the complexity is concentrated
within concept types and only a limited set of relation types is necessary.
WebKB processes 200 basic relation types^10 which collect common thematic,
mathematical, spatial, temporal, rhetorical and argumentative relation types.
This collection of basic types is sufficient for most work.

Secondly, WebKB can use relation signatures to give types to the undeclared 
terms used as concept types. For instance, in the top-level ontology in WebKB, 
the relation types Input, Output, Agent, Method, SubProcess and Purpose are all 
defined to have a concept of type Process as the first argument. WebKB can 
infer from this ontology that Knowledge-design must be a subtype of Process if 
used with the above relation types. 
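
The inference from relation signatures can be sketched as a lookup: if an undeclared concept type appears as the first argument of a relation whose signature requires a Process, it is recorded as a subtype of Process. The six relation names are those given in the text; the table layout, the triple representation of statements and the function name are hypothetical.

```python
# Relation types whose first argument must be of type Process (from the text).
FIRST_ARGUMENT_TYPE = {
    relation: "Process"
    for relation in ("Input", "Output", "Agent", "Method", "SubProcess", "Purpose")
}


def infer_supertypes(statements, declared):
    """Map each undeclared source concept type to the supertypes its relations imply.

    `statements` holds (source type, relation type, target type) triples.
    """
    inferred = {}
    for source, relation, _target in statements:
        required = FIRST_ARGUMENT_TYPE.get(relation)
        if required and source not in declared:
            inferred.setdefault(source, set()).add(required)
    return inferred


if __name__ == "__main__":
    stmts = [("Knowledge-design", "Purpose", "Ontology")]
    print(infer_supertypes(stmts, declared={"Process", "Ontology"}))
```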

The ontology of types becomes an important tool for a system like WebKB.
WebKB uses it to control type signatures and knowledge vocabulary, so it needs
to be a rich and diverse collection. In WebKB, we merged the WordNet^11 ontology
(120,000 words linked to 90,000 concept types) into our top-level ontology.

Subsumption is an inference that determines whether one knowledge statement 
can be inferred from another by specialisation or generalisation. From our 
previous example, [Cat]->(On)->[object] is a generalisation of 
[Cat]->(On)->[Table]. Likewise, [Cat]->(On)->[Table]->(Attr)->[Legs: {*}@4] is a 
specialisation of [Cat]->(On)->[Table] and [Cat]->(On)->[object]. This process 
of comparison is the main mechanism for search and inference in WebKB. When 
knowledge statements follow the same conventions they can be readily compared. 
Using a common and basic relation vocabulary is thus important. A convention of 
primitive concepts and complex relations makes comparison more difficult. Take 
for example the sentence “Mary is 20 years old”. Following our convention, a 
simple relation (Chrc) is used: [Person: "Mary"]->(Chrc)->[Age: @20]. The inverse 
convention would use (Age) as a relation type with the simple concept type 
[Integer]: [Person: "Mary"]->(Age)->[Integer: 20]. 
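
As a rough illustration of the comparison described above, the following Python sketch checks generalisation between statements written as lists of (concept type, relation, concept type) triples. The type hierarchy, the triple encoding and the function names are assumptions made for this example; the check matches triples independently and ignores shared nodes, so it is more permissive than a real CG projection.

```python
# Hypothetical toy type hierarchy: child -> parent.
SUPERTYPE = {"Table": "Object", "Cat": "Animal", "Animal": "Object"}


def is_subtype(t, ancestor):
    """True if t equals ancestor or lies below it in the hierarchy."""
    while t is not None:
        if t == ancestor:
            return True
        t = SUPERTYPE.get(t)
    return False


def generalises(general, specific):
    """Check that every triple of `general` is matched by some triple of
    `specific` with the same relation and specialised concept types."""
    return all(
        any(
            r1 == r2 and is_subtype(s2, s1) and is_subtype(t2, t1)
            for (s2, r2, t2) in specific
        )
        for (s1, r1, t1) in general
    )


cat_on_object = [("Cat", "On", "Object")]
cat_on_table = [("Cat", "On", "Table")]
cat_on_table_with_legs = [("Cat", "On", "Table"), ("Table", "Attr", "Legs")]

print(generalises(cat_on_object, cat_on_table))
print(generalises(cat_on_table, cat_on_object))
```

With a shared basic relation vocabulary the test reduces to comparing concept types, which is the point made in the paragraph above.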

We call a Document Element (DE) any textual/HTML data, for example 
a sentence, a connection, a reference to an image or an entire document. This 
definition excludes binary data but includes textual knowledge statements. We- 
bKB allows users to index any DE of a Web-accessible document with knowledge 
statements, or connect DEs from the same or different documents using relations. 
Fig. 13 shows an example of each. 

The above notations allow the statements and the indexed DEs to be in 
different documents. Thus, any user may index any element of a document on the 
Web. Fig. 11 presents a general interface for knowledge-based queries, showing 
how a document containing knowledge is loaded into WebKB. 

WebKB also allows the document owner to index an image by a knowledge 
statement directly stored in the “alt” field of the HTML “img” tag. We use this 
special case of indexation to present a simple illustration of WebKB’s features. 
This example, shown in Fig. 14, is a good synthesis but is in no way representative 
of the general use of WebKB: it mixes the indexed source data (in this case, a 
collection of images), their indexation, and a customised interface to query 
them, in a single document. Typically, these elements would be split into 
separate documents. The result of the query shown in Fig. 14 is displayed in 
Fig. 15. 

http://meganesia.int.gu.edu.au/~phmartin/WebKB/kb/topLevelOntology.html 
http://www.cogsci.princeton.edu/~wn/ 

$(Indexation 
   (Context: Language: CG; 
             Ontology: http://www.bar.com/topLevelOntology.html; 
             Repr_author: phmartin; Creation_date: Mon Sep 14 02:32:21 1998; 
             Indexed_doc: http://www.bar.com/example.html; ) 
   (DE: {2nd occurence} the red damaged vehicle ) 
   (Repr: [Color: red]<-(Color)<-[Vehicle]->(Attr)->[Damaged] ) )$ 

$(DEconnection 
   (Context: Language: CG; 
             Ontology: http://www.bar.com/topLevelOntology.html; 
             Repr_author: phmartin; Creation_date: Mon Sep 14 02:53:36 1998;) 
   (DE: {Document: http://www.bar.com/example.html} ) 
   (Relation: Summary) 
   (DE: {Document: http://www.bar.com/example.html} 
        {section title: Abstract}) )$ 

Fig. 13. A language for knowledge indexing or connecting any Web-accessible 
document element. 

Because WebKB proposes knowledge representation and query commands, and 
a script language, we have not felt the need to give it a lexical and structural 
query language as precise as Harvest, WebSQL or WebLog. Instead, we have 
implemented Unix-like text processing commands for exploiting Web-accessible 
documents. The command list includes: cat, grep, fgrep, diff, head, tail, 
awk, cd, pwd, wc and echo. A hyper-link path-exploring command 
“accessibleDocFrom” is also provided. This command lists the documents di- 
rectly and indirectly accessible from given documents within a maximal number 
of hyper-links. For example, the following command lists the HTML documents 
accessible from http://www.foo.bar/foo.html (maximum 2 levels) including the 
string “knowledge” in their HTML source code. 

accessibleDocFrom -maxlevel 2 -HTMLonly http://www.foo.bar/foo.html | grep knowledge 
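
The behaviour of such a path-exploring command can be approximated by a breadth-first traversal with a depth bound. The sketch below is not the WebKB implementation: the in-memory link graph, the page sources and the function name `accessible_docs` are hypothetical stand-ins for the Web, and the starting document itself is assumed not to be listed.

```python
from collections import deque


def accessible_docs(start, links, max_level):
    """List documents reachable from `start` by following at most
    `max_level` hyper-links, in breadth-first order."""
    seen = {start}
    order = []
    queue = deque([(start, 0)])
    while queue:
        doc, level = queue.popleft()
        if level == max_level:
            continue  # do not follow links beyond the bound
        for target in links.get(doc, []):
            if target not in seen:
                seen.add(target)
                order.append(target)
                queue.append((target, level + 1))
    return order


# Hypothetical link graph and page sources.
links = {
    "foo.html": ["a.html", "b.html"],
    "a.html": ["c.html"],
    "c.html": ["d.html"],
}
sources = {
    "a.html": "about knowledge bases",
    "b.html": "news",
    "c.html": "knowledge indexing",
    "d.html": "knowledge",
}

reachable = accessible_docs("foo.html", links, max_level=2)
# Equivalent of piping the listing through "grep knowledge".
matches = [d for d in reachable if "knowledge" in sources.get(d, "")]
print(matches)
```

The document d.html contains the search string but lies three links away, so it is excluded by the bound of two levels.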

WebKB includes commands for displaying specialisations or generalisations 
of concept and relation types or of an entire CG. At present, queries for CG 
specialisations retrieve only connected CGs: the processor cannot retrieve paths 
between concepts specified in a query. If a retrieved CG indexes a document 
element, it can be presented instead of the CG (Fig. 15 gives an example). 
In both cases, hyper-links are generated to reach the source of each answer 
presented in its original document. 
Fig. 14. Images, knowledge indexations and a customised query interface 
contained within the same document. The example query shows how the command 
“spec”, which looks for specialisations of a CG, can be used to retrieve images 
indexed by CGs. The results are shown in Fig. 15. 



Specialisation gives the user freedom in the way queries can be formulated: 
searches may be done at a general level and subsequently refined according to 
the results. However, the exact names of types must be known. WebKB allows 
the user to supply only a substring of a type in a query CG, if prefixed by 
the wildcard character %. WebKB replaces the substring by declared types 
including the substring. Replacements violating relation signatures or individual 
types are discarded. For example, spec [%thing] will trigger the generation and 
execution of spec [Something], spec [thingone] and spec [thingtwo]. 
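
The substring expansion can be pictured as follows. This Python sketch is illustrative only: the function name and the list of declared types are hypothetical, case-insensitive matching is an assumption, and the filtering of replacements that violate relation signatures is left out.

```python
def expand_wildcard(query_type, declared_types):
    """Expand a '%substring' type into the declared types containing it."""
    if not query_type.startswith("%"):
        return [query_type]  # exact type name, nothing to expand
    fragment = query_type[1:].lower()
    return [t for t in declared_types if fragment in t.lower()]


# Hypothetical set of declared types.
declared = ["Something", "thingone", "thingtwo", "Process"]
queries = [f"spec [{t}]" for t in expand_wildcard("%thing", declared)]
print(queries)
```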

Knowledge query commands may be combined with the WebKB script lan- 
guage to generate complex documents, perform consistency tests on knowledge 
bases, or solve problems procedurally. The WebKB site provides many exam- 
ples of queries and scripts. For example, one script solves a classical resource 
allocation problem, the Sisyphus-I room allocation problem^^. 



Fig. 15. The document generated in response to the query in Fig. 14. 



One way to generate new knowledge in WebKB is by joining CGs. Various 
kinds of joins over CGs may be used but WebKB only proposes joins which, 
given a set of CGs, create a new CG specialising each of the source CGs. The 
result is inserted in the CG base although it may not represent anything true for 
the user. This does however provide a device for accelerating knowledge represen- 
tation. For instance, CGs related to a type may be collected and automatically 
merged via a command such as this one: spec [TypeX] | maxjoin. The result 
may then serve as a basis for the user to create a type definition for TypeX. 
The following is a concrete example for the maximal join command. 

> maxjoin [Cat]->(On)->[Mat] [Cat: Tom]->(Near)->[Table] 

[Cat: Tom]- { (On)->[Mat]; 
              (Near)->[Table]; 
            } 

Ontology servers, such as the Ontolingua ontology server 
(http://www-ksl-svc.stanford.edu:5915/) and Ontosaurus 
(http://www.isi.edu/isd/ontosaurus.html), support shared knowledge repositories. 
However, ontology servers are not usable for managing large quantities of 
knowledge and, apart from AI-Trader [23], they do not allow indexation and 
retrieval of parts of documents. Finally, support of cooperation between the 
users is essentially limited to consistency enforcement, annotations and 
structured dialogues, as in APECKS, Co4 and Tadzebao. 

We are extending WebKB to handle a knowledge repository. We are considering 
five issues which address scalability: (i) the implementation of the 
knowledge-based system reuses FastDB, a scalable multi-user persistent object 
repository, and (ii) algorithms allowing the exploitation of large-scale dynamic 
taxonomies efficiently; (iii) visualisation techniques (handling term-aliases and 
view generation) to avoid lexical conflicts and enable users to focus on certain 
kinds of knowledge; (iv) protocols allowing users to solve semantic conflicts via 
the insertion of new terms and relations in the common ontology and, in some 
cases, the knowledge of other users; (v) conventions to improve the automatic 
comparison of knowledge by different users. 

6 Conclusion 

The purpose of this paper has been to profile our research involving concept(ual) 
graphs. There are some general recommendations that can be drawn: (i) useful 
intelligent systems result where general frameworks instantiate real application 
problems; (ii) assumptions about user intentions are minimised when the human 
operator is placed at the centre of the information system. In this way the 
combinational complexity of an information system can also be minimised; (iii) this 
in turn facilitates information flows. The hypothesis of our work is that real 
progress in intelligent systems results from these three recommendations. 

To this end we have demonstrated a number of intelligent computer systems: 
a document filtering and retrieval system for medical texts, a similar system for 
the recovery and analysis of email, a system called CGPCF that fuses formal 
concept analysis with relational databases and conceptual graphs, and our 
Web-accessible precision-based information retrieval system, WebKB. 
Abstract. In this paper, we discuss Conceptual Knowledge Discovery in 
Databases (CKDD) in its connection with Data Analysis. Our approach 
is based on Formal Concept Analysis, a mathematical theory which has 
been developed and proven useful during the last 20 years. Formal Concept 
Analysis has led to a theory of conceptual information systems which 
has been applied by using the management system TOSCANA in a wide 
range of domains. In this paper, we use such an application in database 
marketing to demonstrate how methods and procedures of CKDD can 
be applied in Data Analysis. In particular, we show the interplay and 
integration of data mining and data analysis techniques based on Formal 
Concept Analysis. The main concern of this paper is to explain how 
the transition from data to knowledge can be supported by a TOSCANA 
system. To clarify the transition steps we discuss their correspondence to 
the five levels of knowledge representation established by R. Brachman 
and to the steps of empirically grounded theory building proposed by 
A. Strauss and J. Corbin. 
1 Conceptual Knowledge Discovery in Databases 

Conceptual Knowledge Discovery in Databases (CKDD) has been developed in 
the field of Conceptual Knowledge Processing. Based on the mathematical theory 
of Formal Concept Analysis, CKDD aims to support a human-centered process of 
discovering knowledge from data by visualizing and analyzing the formal concep- 
tual structure of the data. Implementing the basic methods of Formal Concept 
Analysis, the management system TOSCANA has been used as a knowledge discovery 
tool in various research and commercial projects (cf. [35]). The general 
approach of CKDD and the qualities of TOSCANA as a KDD support tool have 
previously been discussed in [27] with respect to Brachman and Anand’s fun- 
damental requirements for knowledge discovery support environments (cf. [4]). 
Therefore, the basic notions and the philosophical background of CKDD are only 
briefly summarized in this paper. For a comprehensive presentation of the math- 
ematical foundations of Formal Concept Analysis see [10]; basics of Conceptual 
Knowledge Processing are explained in [31], [32], [33], [35]. 

The overall theme and contribution of the volume “Advances in Knowledge 
Discovery and Data Mining” [7] is a process-centered view of KDD considering 
KDD as an interactive and iterative process between a human and a database 
that may strongly involve background knowledge of the analyzing domain expert. 
In particular, R. S. Brachman and T. Anand [4] argue in favor of a more human- 
centered approach to knowledge discovery support referring to the constitutive 
character of human interpretation for the discovery of knowledge and stressing 
the complex, interactive process of KDD as being led by human thought. 

Following Brachman and Anand, CKDD pursues a human-centered approach 
to KDD based on a comprehensive notion of knowledge as a part of human 
thought and argumentation. The landscape paradigm of knowledge underlying 
CKDD is based on the pragmatic philosophy of Ch. S. Peirce [16] where knowl- 
edge is understood as always being incomplete, formed and continuously as- 
sured by human discourse within an intersubjective community of communica- 
tion (cf. [35]). Emphasizing the intersubjective character of knowledge, CKDD 
considers knowledge communication as an important part of the overall discov- 
ery process with respect to both the dialog between user and system, and also as 
a part of human communication and argumentation. Therefore, a major focus of 
CKDD is to provide knowledge discovery support that guarantees a high trans- 
parency of the discovery process and a representation of its (interim) findings 
to support human argumentation and establishment of intersubjectively assured 
knowledge. CKDD especially supports a wide-ranging and unpredictable interac- 
tive exploration of the data (“data archaeology”, cf. [5]) where the software tools 
TOSCANA and Chianti serve as a knowledge discovery support environment 
in which CKDD applications can be efficiently implemented (see [27]). 

2 Conceptual Data Analysis 

CKDD is based on methods and procedures of Conceptual Data Analysis that 
allow the analysis of given data by examination and visualization of their con- 
ceptual structure. The derived graphical representations have proven to be useful 
for making the data communicable in addition to identifying conceptual relation- 
ships in the data. Knowledge is discovered in interaction with the data during 
an iterative process which activates techniques of Conceptual Data Analysis and 
is guided by theoretical preconceptions and declared purposes of the domain 
expert. In the following paragraphs, we briefly introduce the basic notions and 
procedures of Conceptual Data Analysis using an application in database 
marketing. 

Based on a philosophically grounded formalization of concept (see [34]), Con- 
ceptual Data Analysis allows data to be mathematically treated and processed. 
Formal Concept Analysis, the mathematical theory underlying Conceptual Data 
Analysis, formalizes concept and conceptual hierarchy to reflect the philosophi- 
cal understanding of a concept as a unit of thought constituted by its extension 
and its intension. The extension comprises all objects belonging to the concept 
while the intension consists of all attributes valid for those objects. To allow a 
mathematical description of extension and intension, Formal Concept Analysis 
always starts with a formal context: 

Definition 1. A formal context is a set structure $\mathbb{K} := (G, M, I)$ where 
$G$ and $M$ are sets and $I$ is a binary relation between $G$ and $M$ (i.e. 
$I \subseteq G \times M$). The elements of $G$ and $M$ are called (formal) objects 
and attributes, respectively, and $gIm$ ($:\Leftrightarrow (g, m) \in I$) is read: 
“the object $g$ has the attribute $m$”. Derivations are defined by 
$X' := \{m \in M \mid \forall g \in X : gIm\}$ for $X \subseteq G$ and 
$Y' := \{g \in G \mid \forall m \in Y : gIm\}$ for $Y \subseteq M$. A formal concept 
of the formal context $\mathbb{K}$ is a pair $(A, B)$ with $A \subseteq G$, 
$B \subseteq M$, $A = B'$, and $B = A'$; the sets $A$ and $B$ are called the extent 
and the intent of the formal concept $(A, B)$. The subconcept-superconcept-relation 
is formalized by 

$$(A_1, B_1) \leq (A_2, B_2) \;:\Longleftrightarrow\; A_1 \subseteq A_2 \quad (\Longleftrightarrow B_1 \supseteq B_2).$$

The set of all formal concepts of $\mathbb{K}$ together with the order relation 
$\leq$ is always a complete lattice, called the concept lattice of $\mathbb{K}$ and 
denoted by $\underline{\mathfrak{B}}(\mathbb{K})$. 
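
To make Definition 1 concrete, the following Python sketch computes the derivation operators and all formal concepts of a very small context by brute force. The context (three customers and three departments) is a hypothetical toy example, not data from the application, and the enumeration over all attribute subsets is chosen for readability rather than efficiency.

```python
from itertools import combinations

# Hypothetical toy context: customers (objects) x departments (attributes).
G = ["c1", "c2", "c3"]
M = ["travel", "perfumery", "ladies"]
I = {
    ("c1", "travel"), ("c1", "perfumery"),
    ("c2", "perfumery"), ("c2", "ladies"),
    ("c3", "perfumery"),
}


def intent_of(objects):
    """X': the attributes shared by all objects in X."""
    return frozenset(m for m in M if all((g, m) in I for g in objects))


def extent_of(attributes):
    """Y': the objects having all attributes in Y."""
    return frozenset(g for g in G if all((g, m) in I for m in attributes))


def concepts():
    """All pairs (A, B) with A = B' and B = A', found by closing
    every subset of M."""
    found = set()
    for r in range(len(M) + 1):
        for attrs in combinations(M, r):
            extent = extent_of(attrs)
            found.add((extent, intent_of(extent)))
    return found


def leq(c1, c2):
    """Subconcept order: (A1, B1) <= (A2, B2) iff A1 is a subset of A2."""
    return c1[0] <= c2[0]


for extent, intent in sorted(concepts(), key=lambda c: len(c[0])):
    print(sorted(extent), sorted(intent))
```

For this toy context the loop prints four concepts, from the bottom concept with empty extent up to the top concept whose intent is the single attribute shared by all customers.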

The concept lattices can be graphically represented by line diagrams which 
have been proven to be useful representations for the understanding of conceptual 
relationships in data. Before we illustrate this by examples, we introduce the 
notion of a many-valued context as a formalization of data tables that reports, 
for objects under consideration, specific values with respect to given attributes. 
In order to obtain a concept lattice of a many-valued context, the context has to 
be formally transformed to a formal context (also called a one-valued context). 
This transformation is performed by using conceptual scales which reflect specific 
interpretations of the data. 

Fig. 1. Line diagrams showing the cross-selling between travel accessories, perfumery, 
and ladies’ accessories 

Definition 2. A many-valued context is a set structure $\mathbb{K} := (G, M, W, I)$ 
where $G$, $M$, and $W$ are sets and $I$ is a ternary relation between $G$, $M$, 
and $W$ (i.e. $I \subseteq G \times M \times W$) such that $(g, m, w_1) \in I$ and 
$(g, m, w_2) \in I$ always imply $w_1 = w_2$. The elements of $G$, $M$, and $W$ are 
called objects, attributes, and attribute values, respectively, and 
$(g, m, w) \in I$ is read: “the object $g$ has the attribute value $w$ for the 
attribute $m$”. An attribute $m$ may be considered as a (partial) mapping from $G$ 
to $W$; therefore, $m(g) = w$ is often written instead of $(g, m, w) \in I$. A 
conceptual scale for an attribute $m \in M$ is a one-valued context 
$\mathbb{S}_m := (G_m, M_m, I_m)$ with $m(G) \subseteq G_m$. The context 
$\mathbb{K}_m := (G, M_m, J_m)$ with $g J_m n :\Leftrightarrow m(g) I_m n$ is called 
the realized scale for the attribute $m \in M$. The derived context of $\mathbb{K}$ 
with respect to the conceptual scales $\mathbb{S}_m := (G_m, M_m, I_m)$ 
($m \in M$) is the formal context $(G, \bigcup_{m \in M} \{m\} \times M_m, J)$ with 
$g J (m, n) :\Leftrightarrow m(g) I_m n$; its concept lattice is considered as the 
concept lattice of the many-valued context $\mathbb{K}$ scaled by the conceptual 
scales $\mathbb{S}_m := (G_m, M_m, I_m)$ ($m \in M$). A many-valued context together 
with a collection of appertaining conceptual scales with line diagrams of their 
concept lattices is called a conceptual data system. 
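
The step from a many-valued context to its derived context can be sketched as follows. The data table, the attribute name `ladies_wear_sales` and the thresholds are hypothetical; representing each scale attribute $n$ by a predicate on values (so that $w\,I_m\,n$ holds exactly when the predicate is true for $w$) is an assumption made to keep the example short.

```python
# Hypothetical many-valued context: m(g) = w is stored as table[g][m].
table = {
    "c1": {"ladies_wear_sales": 1200},
    "c2": {"ladies_wear_sales": 500},
    "c3": {"ladies_wear_sales": 0},
}

# Hypothetical conceptual scale for one attribute: every scale attribute n
# is given as a predicate on values, i.e. w I_m n  <=>  predicate(w).
scales = {
    "ladies_wear_sales": {
        "> 400": lambda w: w > 400,
        "> 1000": lambda w: w > 1000,
        "<= 400": lambda w: w <= 400,
    }
}


def derived_context(table, scales):
    """Incidence J of the derived context: g J (m, n) <=> m(g) I_m n."""
    J = set()
    for g, row in table.items():
        for m, scale in scales.items():
            if m not in row:  # m is only a partial mapping
                continue
            for n, holds in scale.items():
                if holds(row[m]):
                    J.add((g, (m, n)))
    return J


J = derived_context(table, scales)
for pair in sorted(J):
    print(pair)
```

The attributes of the derived context are pairs (m, n), which keeps scale attributes of different many-valued attributes apart, as in the definition.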

Conceptual data systems can be implemented with the management system 
TOSCANA (see [29]). For a chosen conceptual scale, TOSCANA presents a line 
diagram of the corresponding concept lattice indicating all objects stored in 
the database in their relationships to the attributes of the scale, thus allowing 
users to navigate through the data and to analyze specific sets of objects by 
activating scales that interpret relevant aspects of the given data. Conceptual 
data systems stored in a database and implemented with a management system 
such as TOSCANA are called conceptual information systems. 

In the following paragraphs, we illustrate how conceptual data analysis may 
be performed with a TOSCANA information system implemented to support the 
database marketing of a Swiss department store. The conceptual scales together 
with line diagrams of their concept lattices are derived from a database record- 
ing the activity of individual customers with respect to the various departments 
of the store. The analysis was undertaken to reveal potentials for cross-selling 
activities. For instance, to select the target group of a direct mail for promot- 
ing the ladies’ wear department, one may start with unfolding the cross-selling 
behavior between departments where women typically buy. 

The line diagram on the left side in Figure 1 shows the cross-selling behavior 
between travel accessories, perfumery, and ladies’ accessories. The line diagram 
represents the concept lattice of the realized scale having as formal objects all 
customers with purchases in at least one of the three departments and having 
the three formal attributes ‘purchased in travel accessories’, ‘purchased in 
perfumery’, and ‘purchased in ladies’ accessories’ while the binary relation records 
who bought in which department. The formal concepts of the realized scale are 
represented in the diagram by the little circles. The name of a formal object 
$g$ is always attached to the circle representing the smallest concept with $g$ in 
its extent (denoted by $\gamma g$); dually, the name of a formal attribute $m$ is 
always attached to the little circle representing the largest concept with $m$ in 
its intent (denoted by $\mu m$). This labelling allows the context relation to be 
read from the diagram because $gIm \Leftrightarrow \gamma g \leq \mu m$, in words: 

The object g has the attribute m if and only if there is an ascending path 
of line segments from the circle labelled with the name of g to the circle 
labelled with the name of m. 

Fig. 2. Line diagram showing sales in women’s clothing accrued by perfume and ladies’ 
accessories customers 

The extent and intent of each concept $(A, B)$ can also be recognized because 
$A = \{g \in G \mid \gamma g \leq (A, B)\}$ and $B = \{m \in M \mid (A, B) \leq \mu m\}$. The line diagrams 
in this paper show instead of the object names only the number of those names 
attached to the appertaining circle. Therefore, the diagram shows that there 
were 1075 customers who bought travel accessories only, 8182 perfumes only, and 
3964 ladies’ accessories only, but nothing in either of the other two departments. 
Furthermore, there were 967 customers who purchased travel accessories and 
something from perfumery but no ladies’ accessories, and 1849 customers who 
were active in all three departments. From the diagram questions naturally arise, 
for example, why do 8182 customers buy perfumery goods but no travel or ladies’ 
accessories even though both departments are right next to each other? 

For the aforementioned mailing select to promote sales in ladies’ clothing, 
the 6474 + 1849 = 8323 customers are of interest because, in general, it is 
easier to develop active customers into better customers. The diagram on the 
right hand side in Figure 1 represents the same facts as the left one, but the 
numbers of customers are summed from the bottom up. To study the group of 
perfume and ladies’ accessory buyers in further detail, TOSCANA allows users to 
“zoom into” the circle in the right diagram representing the 8323 customers who 
bought perfumery goods, ladies’ accessories and, in some cases, travel accessories. 
Figure 2 shows a segmentation of those customers with respect to their previous 
activity in the ladies’ wear department (formal, business, and casual wear). In 
this diagram, the numbers of customers are again summed from the bottom up; 
for instance, there are 1777 customers in the group of 8323 who spent more than 
400 SFr on women’s clothing, 639 who spent more than 1000 SFr, and 1138 
who spent between 400 and 1000 SFr. The customers with low or no activity in 
ladies’ wear were chosen as the targets of the mailing select, as the rest of the 
customers were identified as already being good ladies’ wear customers. 

Fig. 3. Nested line diagram combining numbers of visited departments with the 
cross-selling between Housewares and Interior 
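
The bottom-up sums used in this walk-through can be reproduced with a short Python sketch. The counts are the ones quoted in the text; it is an assumption that 6474 is the number of customers who bought in perfumery and ladies’ accessories only, and the count for travel and ladies’ accessories only is not stated in the text, so it is left out. The variable and function names are hypothetical.

```python
# Counts attached to single circles, as quoted in the text.
# The count for "travel and ladies' accessories only" is not given in the
# text and is omitted; sums that would need it are not computed here.
T, P, L = "travel accessories", "perfumery", "ladies' accessories"
only = {
    frozenset({T}): 1075,
    frozenset({P}): 8182,
    frozenset({L}): 3964,
    frozenset({T, P}): 967,
    frozenset({P, L}): 6474,  # assumed reading of "6474 + 1849"
    frozenset({T, P, L}): 1849,
}


def summed(attributes):
    """Bottom-up sum: customers whose departments include `attributes`."""
    wanted = frozenset(attributes)
    return sum(n for depts, n in only.items() if wanted <= depts)


perfume_and_ladies = summed({P, L})   # 6474 + 1849
good_customers = 639 + 1138           # more than 400 SFr in ladies' wear
mailing_targets = perfume_and_ladies - good_customers
print(perfume_and_ladies, good_customers, mailing_targets)
```

The difference between the 8323 selected customers and the 1777 already good ladies’ wear customers gives the 6546 customers examined next.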

In Figure 3 the activity of the 6546 customers with sales of 400 SFr or less 
in women’s clothing is shown. The nested line diagram presents two aspects of 
the activity of the 6546 customers: the line diagram representing the number 
of departments in which customers shopped (outer part) is combined with the 
cross-selling line diagram between housewares and interior (inner part). The 
circles of the first line diagram have been enlarged so that a copy of the second 
line diagram could be drawn in each enlarged circle. The nested line diagram 
can be read like an ordinary one if we replace the lines between the large circles 
by parallel lines between the corresponding circles of the inner diagrams. For 
instance, we can read from the diagram that there are 4720 customers who 
shopped in 5 or more but fewer than 13 departments of the store, and that 2001 
of those bought housewares as well as interiors, which seems to be a good target 
group for a direct mailing. 

The examples should have made it clear that a TOSCANA information system 
enables an interactive and iterative process of conceptual data analysis leading 
to useful knowledge. The experiences with many TOSCANA systems have 
shown that domain experts are mostly stimulated by navigating through the 
graphical representations because they have a rich background knowledge about 
the appertaining domain and special interests for activating substantial ques- 
tions. The process of knowledge discovery with TOSCANA systems is always 
accompanied by a learning process which increases the ability of the user to bet- 
ter understand the goals and possibilities of the specific exploration procedure. 
All these are reasons for viewing TOSCANA information systems as human- 
centered support of knowledge discovery, as Brachman and Anand advocated 
in [4]. 



3 From Data to Knowledge 

In the previous section it is demonstrated through examples of conceptual data 
analysis how a conceptual information system may function as a knowledge dis- 
covery support environment that promotes human-centered discovery processes. 
In this section we want to explain in general the transition from data to knowl- 
edge for the discovery processes supported by a TOSCANA system. To clarify 
the transition steps from data (understood as symbolic representation of reali- 
ties) to human knowledge, we call upon an analysis of knowledge representations 
in semantic networks performed by R. Brachman [3] who identified the following 
five representation levels (cf. [14]): 

— Implementational Level: The primitives are nodes and links where links are 
merely pointers and nodes are simply destinations for links. On this level, 
there are only data structures from which logical forms can be built. 

— Logical Level: The primitives are logical predicates, operators, and proposi- 
tions together with a structured index over those primitives. On this level, 
logical adequacy is responsible for meaningfully prestructuring knowledge. 

— Epistemological Level: The primitives are conceptual units, conceptual sub- 
pieces, inheritance and structuring relations. On this level, conceptual units 
are determined by their inherent structure and their interrelationships. 

— Conceptual Level: The primitives are word senses and case relations, object- 
and action-types. On this level, small sets of language-independent conceptual 
elements and relationships are fixed, from which all expressible concepts can 
be constructed. 

— Linguistic Level: The primitives are arbitrary concepts, words, and expres- 
sions. On this level, the primitives are language-dependent, and are expected 
to change in meaning as the network grows. 

The grading of the levels, from implementational to linguistic, orders the 
representations from simple and abstract to complex and concrete; hence the 
grading should not be misunderstood as a chronological ordering, although there 
are connections between the grading and the course of the transition from data 
to knowledge. In the following, the representation levels shall be characterized 
according to their functionalities for supporting the process from data to 
knowledge as performed by a TOSCANA information system. 

On the implementational level, the basic data structures are defined as one- 
and many-valued contexts. Already on this elementary level, there are instances 
for establishing connections to human knowledge, namely the formal objects, at- 
tributes, and attribute values of the contexts and the incidence relations between 
those elements. On this level, data contexts are merely considered as formal set 
structures without any content. Implementational issues for TOSCANA systems 
are discussed in [28] in detail. 

On the logical level, names for the formal objects, attributes, attribute values, 
and incidence relations are formally taken as logical predicates which allow the 
composition of further predicates by logical connectives and quantifiers. Syntax 
and formal contextual semantics of those predicates have been elaborated to 
the so-called Terminological Attribute Logic (see [18], [11]) and Terminological 
Concept Logic (see [2]) which are both related to description logics. Both termi- 
nological logics may assist the formation of abstract scales for the methods of 
conceptual, relational, and logical scaling (see [17], [19]). The management sys- 
tem TOSCANA allows the activation of used logical expressions by representing 
them as SQL-queries. The combination of abstract scales to larger contexts is 
also performed on the logical level, namely by various context constructions; the 
most frequently used context constructions are the semiproduct, which is basic 
for ‘plain conceptual scaling’ (see [9]), and the apposition, which underlies the 
nested line diagrams used by TOSCANA as exemplified in Section 2 (see [29]). 
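
As an illustration of one of these context constructions, the apposition of two contexts over the same object set places their attribute sets side by side as a disjoint union. The Python sketch below assumes contexts are given as (objects, attributes, incidence) triples; the function name and the two small example contexts are hypothetical.

```python
def apposition(context_1, context_2):
    """Apposition of two contexts over the same object set G.

    Attributes are tagged with 1 or 2 so that the union stays disjoint
    even if both contexts use the same attribute name.
    """
    G1, M1, I1 = context_1
    G2, M2, I2 = context_2
    if set(G1) != set(G2):
        raise ValueError("apposition needs a common object set")
    M = [(1, m) for m in M1] + [(2, m) for m in M2]
    I = {(g, (1, m)) for g, m in I1} | {(g, (2, m)) for g, m in I2}
    return list(G1), M, I


# Hypothetical realized scales over the same three customers.
G = ["c1", "c2", "c3"]
visits = (G, ["5 or more departments"], {("c1", "5 or more departments")})
cross = (
    G,
    ["housewares", "interior"],
    {("c1", "housewares"), ("c1", "interior"), ("c2", "interior")},
)

G_all, M_all, I_all = apposition(visits, cross)
print(M_all)
```

The concept lattice of the combined context is what a nested line diagram displays, with one scale drawn as the outer diagram and the other as the inner one.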

The epistemological level addresses “the possibility of organizations of con- 
ceptual knowledge into units more structured than simple nodes and links or 
predicates and propositions” [3]. Formal concepts are indeed more internally 
structured than just a node or a predicate: they unify an object set (the extent) 
and an attribute set (the intent) so that each of these parts determines the other. 
Furthermore, the internal structure of the formal concepts gives rise to a con- 
ceptual hierarchy which mathematically forms a complete lattice if the formal 
concepts are those of a given formal context. Thus, the rich mathematical theory 
of Formal Concept Analysis (see [10]) yields a substantial contribution to Brach- 
man’s epistemological level. As Formal Concept Analysis is founded on lattice 
theory, lattice constructions and decompositions can be activated for establish- 
ing more complex concept hierarchies out of simpler ones, and, vice versa, for 
reducing complex concept hierarchies to simpler ones. Constructions like (sub-) 
direct products and tensor products of concept lattices and decompositions like 
subdirect and atlas decompositions have been successfully applied in data analy- 
sis and knowledge processing. For supporting the process of knowledge discovery, 
the visualization of concept lattices and their constructions and decompositions 
by specific line diagrams are of great importance. Those visualizations (also be- 
longing to the epistemological level) are able to stabilize knowledge acquisition 
and communication (cf. [32]). 

On the conceptual level, word senses are represented by the context attributes 
which lead to a contextual representation of concept intensions. As primitive case 
relations, four basic relations are defined: an object has an attribute, an 
object belongs to a concept, a concept abstracts to an attribute, and a concept is 
a subconcept of another concept (cf. [12]). These four relations are basic for the 
knowledge representation in conceptual information systems because, together 
with the word senses, they can represent a large amount of language-independent 
knowledge structures. Such structures are the concrete scales of TOSCANA systems 
which are used to capture the intensional content of an application domain 
(the extensional side of those scales is still abstract). 

On the linguistic level, TOSCANA systems work with realized scales which 
are obtained by actualizing the abstract objects of their concrete scales according 
to real data. This realization in particular allows concept graphs representing 
verbal texts to be deduced (see [20]). On this level, the knowledge representation 
is language-dependent so that users of the conceptual information system can 
best activate their background knowledge and common sense. The navigation 
through the conceptual landscape of the system, visualized by labelled line di- 
agrams, can be performed successfully because the interplay between formal 
and material thinking stimulated by the diagrams gives purposeful orientations 
(cf. [35]). 

The given characterization of the five representation levels for TOSCANA 
information systems shall now be used for explaining the discovery process from 
data to knowledge. This process can be seen in correspondence with the process 
of empirically grounded theory building proposed by A. Strauss and J. Corbin in 
[22] (see also [21]). According to Strauss and Corbin (p.57), empirically grounded 
theory building starts from data which are broken down, conceptualized, and put 
back together in new ways to generate a rich, tightly woven, explanatory theory 
that closely approximates the reality it represents. Although Strauss and Corbin 
are concentrating on theory building as the most systematic way of forming, 
synthesizing, and integrating scientific knowledge, their methodology may also 
apply to structuring and explaining the discovery process from data to knowledge 
in the more general case. This shall be outlined by means of the TOSCANA 
system discussed in the previous section. 

The first step of breaking down the data is performed to establish the imple- 
mentational level: the raw data are shaped to obtain elementary data structures 
which allow further formal treatments. In the case of our example, the raw data 
are coded in a relational database as a list of purchase transactions, each de- 
scribed by the ID number of the customer, the date, the department, and the 
purchase amount. From these data, suitable many-valued contexts are derived
and represented in a data-warehouse as, for example, a many-valued context 
with the customers as formal objects structured by the many-valued attributes 
‘department’, ‘date’, and ‘purchase amount’. Establishing one- and many-valued 
contexts is a first move toward a conceptualization of the data. 
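
The derivation of a many-valued context from the transaction list can be sketched as follows. This is only a minimal illustration: the sample rows, the function name `many_valued_context`, and the choice to collect all values of a customer in a list per attribute are assumptions made here, not part of the TOSCANA system described above.

```python
from collections import defaultdict

# Hypothetical raw data: (customer_id, date, department, purchase_amount)
transactions = [
    (1, "1999-03-01", "Food", 40.0),
    (1, "1999-03-09", "Wine", 25.5),
    (2, "1999-03-02", "Sports", 120.0),
]

def many_valued_context(rows):
    """Group transactions by customer.

    Each customer becomes a formal object; for every many-valued
    attribute we collect the values seen in that customer's transactions.
    """
    context = defaultdict(
        lambda: {"date": [], "department": [], "purchase amount": []}
    )
    for customer, date, department, amount in rows:
        context[customer]["date"].append(date)
        context[customer]["department"].append(department)
        context[customer]["purchase amount"].append(amount)
    return dict(context)

mv_context = many_valued_context(transactions)
```

Conceptual scales would then be applied to these attribute values to obtain one-valued contexts.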

The next step of conceptualization is, according to Strauss and Corbin, con- 
cerned with categorization. For TOSCANA systems, categorization is performed 
by methods of conceptual, relational, and logical scaling which, on the logical 
level, are only understood formally. In Figure 2, an example of a conceptual scale 
is shown having formal attributes described by formal expressions which can be
represented by SQL-queries in the management system TOSCANA. The appo- 
sition construction yielding the nested line diagram in Figure 3, which enlarges 
the attribute categorization, also belongs to the logical level. 

The formal conceptualization is fully elaborated on the epistemological level. 
The concept lattices and the line diagrams as abstract structures are located
on this level, as are the formal procedures which make those lattices and diagrams
a successful support for knowledge acquisition and communication. The
categorization leading to the attributes of an abstract conceptual scale is now embedded
into the significantly richer structure of the concept lattice of the scale,
which becomes human-readable through a suitable line diagram. The richness of information
given by such a graphical representation may be seen in Figure 3; the
nested structure shown in this figure reflects a subdirect product construction 
of the two combined concept lattices. 

On the conceptual level the formal structures of the first three levels receive 
intensional meaning. For instance, the attribute names in Figure 1 are (on this 
level) understood by their literal meaning; thereby, the intensions of a repre- 
sented concept can be described by combining all those meanings which belong 
to the attribute names attached to its superconcepts. Since the numbers in Fig- 
ure 1 come from actual customers, they obtain their full meaning, discussed in 
Section 2, only on the linguistic level. On the conceptual level the concept lattices 
in Figure 1 represent a concrete scale which, according to Strauss and Corbin, 
may be understood as an intensionally determined dimension for the data to be
analysed. 

The full support for knowledge discovery is given on the linguistic level where 
the formal objects also carry meaning and, therefore, the formal concepts can 
unify intensional and extensional meaning. Of course, if further customers are 
considered in the presented example then the extensional meaning may change 
(although the intensional meaning of the concrete scales stays the same). On
this level, we can produce substantial interpretations of the data by suitable
comparisons using nested line diagrams as in Figure 3; these diagrams correspond
to the axial coding of Strauss and Corbin. Clearly, the rich, tightly woven,
suggestive landscape of concept lattices that closely approximates the reality it 
represents, can serve through its representation by a TOSCANA information 
system, as a stimulating knowledge discovery support environment. 



4 Procedures of Conceptual Knowledge Discovery 

In most applications, classical data analysis and decision support facilities (for in- 
stance Online Analytical Processing (OLAP) or statistical packages) are already 
present when data mining tools are added to the knowledge discovery support 
environment. For supporting the analyst in the overall process of human-centered
knowledge discovery, both decision support and data mining tools should provide
a homogeneous environment. In particular, this shows the need for a unified
knowledge representation. In conceptual information systems, concept lattices 
are used as such a unified knowledge representation. TOSCANA information
systems have shown their use for data analysis in over 30 implementations. The
relationship between conceptual information systems and Online Analytical Processing
is discussed in [23].

In the first part of this section, we show how data analysis and data mining 
techniques based on Formal Concept Analysis may support each other. In the 
second part, we go one step further: there, we present Chianti, a new tool 
that integrates data mining and data analysis in the framework of Conceptual 
Knowledge Discovery (CKDD). 



4.1 Interplay of Data Analysis and Knowledge Discovery: 
Association Rules and Frequent Concept Lattices

In this subsection, we discuss how Formal Concept Analysis may support the 
mining of association rules, and how, vice versa, results of association rules min- 
ing may be used for decreasing the complexity of the visualization of traditional 
data analysis within conceptual information systems. Association rules are state- 
ments of the type ‘37 % of the customers buying coffee also buy milk’. The task 
of mining association rules is to determine all rules that have a certain confi- 
dence (37 % in the example) and a certain support (the percentage of customers 
buying coffee and milk). Mining association rules can nowadays be considered 
as one of the core tasks of KDD. Algorithmic aspects of mining association rules 
within the framework of Formal Concept Analysis are discussed in more detail 
in [15] and [30]. 



Improving the mining of association rules by using Formal Concept 
Analysis techniques. In terms of Formal Concept Analysis, the problem is the
following: Let $K := (G, M, I)$ be a formal context (for instance, $G$ could be the set
of transactions registered during a certain time period in the department store,
$M$ the set of products (or items) sold by the store, and $(g, m) \in I$ means that item
$m$ was purchased in transaction $g$). Each subset $X$ of $M$ is called an itemset. The
support of $X$ is defined by $\mathrm{supp}(X) := \frac{|X'|}{|G|}$. An association rule $X \to Y$ consists
of two subsets $X$ and $Y$ of $M$. We say that the rule $X \to Y$ holds with support
$\mathrm{supp}(X \to Y) := \frac{|(X \cup Y)'|}{|G|}$ and with confidence $\mathrm{conf}(X \to Y) := \frac{|(X \cup Y)'|}{|X'|}$
(in short: $X \to Y$ with $s := \mathrm{supp}(X \to Y)$ and $c := \mathrm{conf}(X \to Y)$). The
task is now to compute, for given $\mathit{minsupp}, \mathit{minconf} \in [0, 1]$, all association rules
$X \to Y$ with $s \geq \mathit{minsupp}$ and $c \geq \mathit{minconf}$.
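
As a small illustration of these notions, the following sketch computes the support of a rule as the share of transactions that contain all items of both sides, and its confidence as that count relative to the transactions containing the left-hand side. The toy context and all function names are hypothetical and chosen only for this example.

```python
def extent(context, itemset):
    """Return the set of transactions that contain every item of the itemset."""
    return {g for g, items in context.items() if itemset <= items}

def supp(context, x):
    """Share of all transactions that contain the itemset x."""
    return len(extent(context, x)) / len(context)

def rule_supp(context, x, y):
    """Support of the rule x -> y: support of the union of both sides."""
    return supp(context, x | y)

def conf(context, x, y):
    """Confidence of x -> y (raises ZeroDivisionError if no transaction contains x)."""
    return len(extent(context, x | y)) / len(extent(context, x))

# Hypothetical toy context: transaction -> set of purchased items
context = {
    "t1": {"coffee", "milk"},
    "t2": {"coffee"},
    "t3": {"coffee", "milk", "bread"},
    "t4": {"bread"},
}

s = rule_supp(context, {"coffee"}, {"milk"})  # 2 of 4 transactions
c = conf(context, {"coffee"}, {"milk"})       # 2 of 3 coffee transactions
```

A rule would be reported only if both `s` and `c` reach the chosen thresholds.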

The notion of association rules and their application to large databases was 
introduced by R. Agrawal, T. Imielinski, and A. Swami in [1]. They stated the 
problem and provided a first algorithm. Now there are several algorithms for 
mining association rules in the literature, see for instance [15] for details. 

Rules that hold only with a certain confidence have been investigated be- 
fore by many researchers. For instance, in the framework of Formal Concept 
Analysis, M. Luxenburger [13] has called them partial implications. They are a 
generalization of implications which play an important role in Conceptual Data
Analysis based on Formal Concept Analysis. Implications are association rules
which hold for all objects but have no restriction on the support, i.e., they are
exactly the association rules with minconf = 1 and minsupp = 0.

One problem in presenting the mined association rules to the user is that 
they usually form a long list, from which only very few are of interest to the 
domain expert. Using the following theorem ([15,26]) one can reduce the list 
without losing any information: 

Theorem. Let $X, Y \subseteq M$. Then $X \to Y$ and $X'' \to Y''$ have the same
support and the same confidence.

It is based on the fact that, for any frequent itemset $Y$, the smallest concept
intent which contains $Y$ (i.e., $Y''$) has the same support and hence is also
frequent. For the development of algorithms, this property permits the consideration
of only concept intents (instead of all itemsets) for determining the set
of frequent itemsets [15,30]. Especially in strongly correlated data, the algorithm
can thereby skip many itemsets. 
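
The statement that an itemset and the smallest concept intent containing it have the same support can be checked on a toy example. The sketch below is illustrative only; the context and the helper names `extent` and `closure` are assumptions of this example and do not reproduce the algorithms of [15,30].

```python
def extent(context, itemset):
    """Transactions that contain every item of the itemset."""
    return {g for g, items in context.items() if itemset <= items}

def closure(context, itemset):
    """Smallest concept intent containing the itemset.

    These are the items shared by all transactions in the extent
    (all items of the context if the extent is empty).
    """
    objs = extent(context, itemset)
    all_items = set().union(*context.values())
    return {m for m in all_items if all(m in context[g] for g in objs)}

# Hypothetical toy context: transaction -> set of purchased items
context = {
    "t1": {"coffee", "milk", "sugar"},
    "t2": {"coffee", "sugar"},
    "t3": {"milk"},
}

x = {"coffee"}
x_closed = closure(context, x)  # every coffee buyer also bought sugar

# Same extent, hence same support for the itemset and its closure
same_extent = extent(context, x) == extent(context, x_closed)
```

Because both itemsets have the same extent, an algorithm only needs to count each concept intent once.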

Using the theorem, one can present a significantly shorter list of association 
rules without losing any information. The list is composed of the so-called
Duquenne-Guigues basis for exact association rules and the Luxenburger basis 
for approximate association rules. Both bases are introduced in [30], together 
with algorithms for their computation. 

Reducing the complexity of data visualization in conceptual informa- 
tion systems by using results from association rule mining. For exam- 
ining cross-selling (cf. Section 2), the concepts having many attributes - and 
hence only relatively few objects! - are of special importance. In those cases, one 
needs the whole line diagram for an analysis of how well cross-selling works. But 
there are many applications where concepts which differentiate the population 
too much are not interesting - at least not for a first overview. In that situa- 
tion, frequent concepts, as defined above, can be utilized. By fixing a threshold 
minsupp, all infrequent concepts of the conceptual scale can be pruned. Then, 
only the frequent concepts are displayed. For instance, if we want to have a first 
glance at the distribution of the age of the customers, then the conceptual scale 
‘Age’ may be too detailed. By fixing minsupp := 25%, we prune 18 of the 30 con- 
cepts of the scale ‘Year of Birth’. The remainder is shown in Figure 4. Two facts 
can be easily seen: a) the birthyear of more than half the credit card customers is
unknown, and b) 46 % of all credit card customers were born before 1973. Hence,
there are very few customers with a known birthyear who are younger than 25 
and have paid with a credit card. 
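
Pruning the infrequent concepts of a scale amounts to a simple filter on the relative size of the concept extents. The following sketch assumes that the concepts are already available as (extent, intent) pairs; the sample data and the name `frequent_concepts` are hypothetical.

```python
def frequent_concepts(concepts, n_objects, minsupp):
    """Keep the concepts whose extent covers at least minsupp of all objects."""
    return [
        (ext, intent)
        for ext, intent in concepts
        if len(ext) / n_objects >= minsupp
    ]

# Hypothetical concepts of a small scale over 8 customers
concepts = [
    ({1, 2, 3, 4, 5, 6, 7, 8}, set()),
    ({1, 2, 3, 4, 5}, {"birthyear unknown"}),
    ({6, 7, 8}, {"born before 1973"}),
    ({8}, {"born before 1973", "born before 1950"}),
]

kept = frequent_concepts(concepts, n_objects=8, minsupp=0.25)
```

Only the concepts in `kept` would then be drawn in the line diagram.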

4.2 Integration of Data Analysis and Knowledge Discovery:
Guided Learning

Fig. 4. Conceptual Scale ‘Year of Birth’ restricted to frequent concepts with minsupp = 25%

In the expression supervised learning (as a task of Machine Learning), ‘learning’
is used in a metaphorical way. One expects the software to find an intensional
description of some subpopulation, based on a training set. As CKDD is seen as
a human-centered knowledge discovery process, our aim is to support the learn- 
ing process (in its literal meaning) of a human expert. Human knowledge always 
relies on background knowledge which is formed by intersubjective argumenta- 
tion, and only part of this knowledge can be expressed explicitly. Knowledge 
which can be made explicit may be treated by procedures of Machine Learning. 
But if one considers all aspects of knowledge, then it becomes clear that learning 
can only be supported by a knowledge discovery environment, but can never be 
completely automated. 

In this setting, we understand guided learning as a technical support for the 
learning process of the human expert.^ Guided learning shall automatically lead 
the user to conceptual scales (or combinations of conceptual scales) which are 
expected to provide interesting information, combined with the freedom of nav- 
igating around. As in supervised learning, the problem we tackle is to gain more 
knowledge about a given subpopulation. The difference is that we do not neces- 
sarily require an explicit description of the behavior. For instance, we might want 
to learn (in its literal meaning) more about the differences in buying behavior 
between high- and low-spending credit card customers. 

^ The expression ‘guided learning’ is also used for education and training software,
but here we use it to show the analogy to supervised learning.

Scale                                                   Value
Xselling Housewares/Interior                            0.0684
Xselling Food/Wine                                      0.0570
Xselling Travel Access./Perfumery/Ladies’ Accessories   0.0325
Xselling Perfumery/Housewares/Food                      0.0324
Xselling Perfumery/Ladies’ Fashion                      0.0305
Xselling Wine/Men’s Fashion/Perfumery                   0.0275
Xselling Ladies’ Fashion/Men’s Fashion/Sports           0.0229
Xselling Sports/Children/Travel Accessories             0.0160
Xselling Men’s Clothing (incl. Underwear)
Xselling Ladies’ Wear
Xselling Men’s City

Fig. 5. Ranking of conceptual scales related to cross-selling

For this purpose, we have developed the new tool Chianti, based on [24]
and [25]. Chianti takes as input two subpopulations which are defined by SQL
queries. In the following example, we have divided the population in two parts:
those customers who spent more than 1000 SFr and those who spent less. This
tool compares the distribution of the two subpopulations in all scales of the
conceptual information system and returns a ranking of all scales. In the ranking,
the scales which appear at the top are those where the distribution differs the
most. The current implementation of Chianti provides two measures for the
distribution: the $\chi^2$-measure (hence the name of the program) and the maximum
norm. While the first measure takes the differences in all concepts into
account (the larger ones more than proportionally), the second measure only regards
the concept with the largest difference. This approach is useful when an easy 
interpretation of the ranking is desired. At the moment, Chianti only works on 
the contingents (this means that, for the measure, the cardinality of the concept 
extents is not used, only the number of objects which generate the concept). As 
the difference of the distributions of the two populations may be more significant 
in more general concepts (which are not necessarily generated by single objects), 
the next version of Chianti will also analyze concept extents. 
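
The ranking idea can be sketched as follows. The exact formulas used by Chianti are not given in the text, so this example is an assumption throughout: it normalizes the contingent counts of each subpopulation to relative frequencies, uses a sum of squared differences as a stand-in for a measure that weights larger differences more than proportionally, and uses the largest absolute difference as the maximum norm. Scale names and counts are hypothetical.

```python
def relative(counts):
    """Turn contingent counts into relative frequencies."""
    total = sum(counts)
    return [c / total for c in counts]

def squared_measure(a, b):
    """Assumed stand-in: sum of squared differences over all concepts."""
    p, q = relative(a), relative(b)
    return sum((x - y) ** 2 for x, y in zip(p, q))

def max_norm(a, b):
    """Largest absolute difference over all concepts."""
    p, q = relative(a), relative(b)
    return max(abs(x - y) for x, y in zip(p, q))

def rank_scales(scales, measure):
    """Sort scale names so that the most differentiating scale comes first."""
    return sorted(scales, key=lambda name: measure(*scales[name]), reverse=True)

# Hypothetical contingent counts per concept: (subpopulation 1, subpopulation 2)
scales = {
    "Scale A": ([50, 30, 20], [20, 30, 50]),
    "Scale B": ([40, 35, 25], [38, 36, 26]),
}

ranking = rank_scales(scales, squared_measure)
```

Passing `max_norm` instead of `squared_measure` gives the second kind of ranking, which depends only on the single concept with the largest difference.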

Figure 5 shows the ranking of all scales related to cross-selling for the two 
subpopulations mentioned above with the $\chi^2$-measure. The scale at the top is
the scale ‘Cross-selling Housewares/Interior’ which we have already seen as inner
scale in Figure 3. This means that among all cross-selling scales, this scale differ- 
entiates the two groups the most. The scale ‘Cross-selling Housewares/Interior’ 
also appears as topmost scale in the ranking according to the maximum norm. 

By combining the topmost scales with the scale ‘Money spent < / > 1000 
SFr’ we can analyze the distribution of the two groups in more detail. The com- 
bination of this scale together with the scale ‘Cross-selling Housewares/Interior’ 
is shown in Figure 6. In the diagram, we have set the top element of each inner 
scale to 100% in order to facilitate comparison. We see that the high-spending 
customers buy over-proportionally in the departments Housewares (265% more 
often) and Interior (322% more often). Furthermore, for this customer group, 
the cross-selling between both departments is much higher than for the rest: 
The percentage of high-spending customers who were active in both interior 
and housewares (36.98%) is much greater than that of low-spending customers 
(5.56%). 

Fig. 6. Customers of the Housewares department differentiated by the amount of
money spent

We emphasize that, unlike many other statistical techniques, the ranking
of the scales is not the final result, but a suggestion to the analyst of certain
combinations of scales for analyzing the situation in more detail. The ranking
alone does not indicate that the buying behavior in the housewares department
determines the value of the customer. In particular, it is not possible to decide
automatically if a prominent position in the ranking indicates a cause for or a
consequence of the different distribution, as is clearly demonstrated by studying
the ranking of all the scales. The topmost scales are then all scales related to the 
amount of money spent. In those scales, one will hardly discover new insights. 
The next scale is then ‘Active Time (in days)’. This scale does not provide an 
interesting insight either, since it is intuitively clear that a typical customer 
usually spends less than 1000 SFr in a single transaction; hence to spend more 
money, he has to visit the department store more than once. The next scale then 
is the scale ‘Cross-selling Housewares/Interior’. 

The insight that the scale about the active time is not useful for this kind of 
analysis can only be gained by referring to the implicit background knowledge of 
the domain expert. A repository which stores such information explicitly cannot 
overcome the general problem. There is an almost boundless number of possible 
combinations of conceptual scales in a conceptual information system which 
cannot be conceived of in advance. However, it is promising for further research 
to consider such a repository which ‘learns’ (in the metaphorical meaning) from 
the behavior of the analyst which combinations are of interest and which are 
not. 
Abstract. CEM is an email management system which stores its email
in a concept lattice rather than in the usual tree structure. By using 
such a conceptual multi-hierarchy, the system provides more flexibility in 
retrieving stored emails. The paper presents the underlying mathematical 
structures, discusses requirements for their maintenance and presents 
their implementation. 



1 Motivation 

The way standard email management systems store mails is directly derived 
from the tree structure of file management systems. This has the advantage that 
trees have a simple structure which can easily be explained to novice users. The 
disadvantage is that at the moment of storing an email the user already has 
to foresee the way she is going to retrieve the mail later. The tree structure 
forces her to decide at that moment which criteria to consider as primary and 
which as secondary. For instance, when storing an email regarding the organization
of a conference, one has to decide whether to organize one’s directories like
mineau/iccs2000/program_commitee or like conferences/iccs/iccs2000/organisation/mineau.
This problem arises especially if a user cooperates with
overlapping communities on different topics. 

In this paper, we present the Conceptual Email Manager CEM. It uses a 
formal context as its structure for storing email rather than a tree. This allows 
the user to retrieve emails via a concept lattice following different paths. For the 
example above this means that one need not decide which of the two paths to 
use for storing. For retrieving the mail later, one can consider any combination 
of the catchwords^ in the two paths. 

Concept lattices are defined in the mathematical theory of Formal Concept 
Analysis [12]. A concept lattice is derived from a binary relation which assigns 
attributes to objects. In our application, the objects will be all emails stored by 
the system, and the attributes will be catchwords like ‘conferences’, ‘mineau’, 
and ‘organisation’. We assume the reader to be familiar with the basic notions of 

^ By catchwords we mean small natural language phrases under which the user may 
meaningfully classify documents. 
Formal Concept Analysis, and refer otherwise to [3] and to proceedings of past
ICCS conferences. 

There are related approaches to the problem stated above. For instance,
the concept of a virtual folder was introduced in a program called View Mail 
(VM) [6]. A virtual folder is simply a collection of email documents retrieved 
in response to a query. The virtual folder concept has more recently been pop- 
ularized by a number of open source projects, e.g. [8]. Our system differs from 
those projects in both the understanding of the underlying structure via formal 
concept analysis, and the implementation. 

Our approach is also related to the library information system implemented
in the Center of Interdisciplinary Studies at Darmstadt University of Technology [7].
That system is based on the management system TOSCANA for Conceptual
Information Systems [11]. The retrieval components of our system
and the library system provide basically the same functionality. The difference
lies in the support for the user maintaining and updating the email collection. 
This is due to the fact that, while in the library system maintenance is allowed 
only to the librarian and/or a knowledge engineer, in an email management sys- 
tem storing emails is an essential and often used feature which requires some 
semi-automatic support for an untrained user. 

In the next section, we will describe the mathematical structures of the Con- 
ceptual Email Manager. Requirements for their maintenance are discussed in 
Section 3. Issues related to an implementation of the requirements are discussed 
in Section 4. The paper is concluded by an outlook on future work. 

In this paper we endeavor to define precisely the behavior of a natural user
interface for managing emails based on Formal Concept Analysis. Although the
interface is designed to exhibit simple and rational behavior to the user, the
reader will find that its exact semantics with respect to the underlying program
structures are rather detailed.



2 Structures Underlying CEM 

We assume that the reader is familiar with the following two basic notions of 
Formal Concept Analysis: formal context and concept lattice. Definitions and 
examples can be found in [3] or in previous ICCS proceedings. 

In this section, we describe the system on a structural level; we abstract from 
implementation details. They will be discussed in Section 3. Basically, we can 
distinguish three fundamental structures: 

1. A formal context which assigns to each email a set of catchwords; 

2. a hierarchy on the set of catchwords in order to define an information order- 
ing over the catchwords; 

3. and a mechanism for creating conceptual scales which are used within a 
graphical interface for the retrieval of emails. 



These three structures are discussed in detail in the remainder of this section. 




Richard Cole and Gerd Stumme 



2.1 Assigning Catchwords to Emails 

In the conceptual email manager, we use a formal context (G, M, I) for storing 
the emails and for assigning catchwords to them. The set G contains all emails 
stored in the system, the set M contains all catchwords. For the moment we con- 
sider M to be unstructured. (In the next subsection however, we will introduce 
a hierarchy on it.) 

The relation $I$ indicates which emails are assigned to which catchwords. In
the example given in the introduction, the user might want to assign all the
catchwords ‘mineau’, ‘iccs2000’, ‘program_commitee’, ‘conferences’, ‘iccs’, and
‘organisation’ to the new email. The incidence relation is generated in a semi-automatic
process: (i) an automatic string-search algorithm may recognize words
within sections of an email and suggest relations between the email and some 
attributes, (ii) the user may accept the suggestion or modify it, and (iii) she also 
may attach user defined attributes to the email. In Section 3, we will discuss how 
the user is supported in this assignment process. At the moment, we suppose 
that the relation is already given. 

Instead of a tree of disjoint folders and sub-folders, we consider the concept
lattice $\mathfrak{B}(G, M, I)$ as navigation space. The formal concepts replace the folders.
In particular, this means that emails can appear in different concepts. The most 
general concept contains all emails. The deeper the user gets in the hierarchy, 
the more specific are the concepts, i. e., the smaller is the number of emails they 
contain. Even so the user may, using general catchwords only, still obtain a great 
search depth from the conjunctions present in the concept lattice. 
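
To make the navigation space concrete, the following sketch enumerates all formal concepts of a tiny email context by closing every subset of catchwords. This brute-force enumeration is exponential in the number of catchwords and is meant for illustration only; it is not the method used in CEM, and the emails and the function name `formal_concepts` are hypothetical.

```python
from itertools import combinations

def formal_concepts(context):
    """Return all (extent, intent) pairs of a context given as email -> catchwords."""
    objects = set(context)
    attributes = set().union(*context.values())
    found = set()
    for r in range(len(attributes) + 1):
        for subset in combinations(sorted(attributes), r):
            # Emails that carry every catchword of the subset
            ext = frozenset(g for g in objects if set(subset) <= context[g])
            # Catchwords shared by all of those emails
            intent = frozenset(
                m for m in attributes if all(m in context[g] for g in ext)
            )
            found.add((ext, intent))
    return found

# Hypothetical emails with their catchwords
emails = {
    "e1": {"conferences", "mineau"},
    "e2": {"conferences", "organisation"},
    "e3": {"conferences", "mineau", "organisation"},
}

lattice = formal_concepts(emails)
concepts_with_e3 = [c for c in lattice if "e3" in c[0]]
```

In this example the email `e3` appears in every concept, which shows how one email can be reached along several paths.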



2.2 A Hierarchy on the Catchwords 

In order to support the semi-automatic assignment of catchwords to the emails, 
we additionally provide the set $M$ of catchwords with a partial order $\leq$. For
this subsumption hierarchy, we assume that the following compatibility condition
holds:

$\forall g \in G,\ m, n \in M\colon\ (g, m) \in I,\ m \leq n \Rightarrow (g, n) \in I$   (‡)

(i. e., the assignment of catchwords to emails respects the hierarchy on the catch- 
words). Hence for assigning catchwords to emails, it is sufficient to assign the 
most specific catchwords only. All more general catchwords will be added auto- 
matically by the system. The maintenance of the hierarchy will be discussed in 
the two following sections. 
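
The automatic addition of the more general catchwords can be sketched as an upward closure in the hierarchy. In this minimal example the hierarchy is assumed to be stored as a mapping from each catchword to its directly more general catchwords; this representation, the small hierarchy fragment, and the name `upward_closure` are illustrative assumptions, not a description of the CEM implementation.

```python
def upward_closure(catchwords, parents):
    """Add every catchword that is more general than one of the given ones.

    parents maps a catchword to the set of its directly more general catchwords.
    """
    result = set(catchwords)
    stack = list(catchwords)
    while stack:
        m = stack.pop()
        for n in parents.get(m, ()):
            if n not in result:
                result.add(n)
                stack.append(n)
    return result

# Hypothetical fragment of a catchword hierarchy
parents = {"iccs2000": {"iccs"}, "iccs": {"conferences"}}

# The user assigns only the most specific catchwords
assigned = upward_closure({"iccs2000", "cole"}, parents)
```

Storing the closed set for each email keeps the assignment compatible with the hierarchy.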

As an example, the user may want to say that ‘iccs’ is a more specific catchword
than ‘conferences’, and that ‘iccs2000’ is more specific than ‘iccs’ (i.e.,
‘iccs2000’ < ‘iccs’ < ‘conferences’). Emails regarding the production of this paper
are then assigned by the authors to the catchword ‘iccs2000’ only (and maybe
additionally to catchwords like ‘cole’ or ‘stumme’, and to ‘papers’). When the
authors want to retrieve these emails, they do not need to remember that they
stored them under ‘iccs2000’. They will also find them under the more general
catchword ‘conferences’. If this catchword provides a list of emails that is too
long, then they can either refine the search by taking a sub-term like ‘iccs’ or
alternatively by adding another catchword, for instance ‘cole’. The next subsection
describes the structures which support the user in this kind of navigation.

Fig. 1. Part of a catchword hierarchy

While the theory does not require that a particular structure be imposed on the
hierarchy, it is likely that the user will impose some structural
notions on $(M, \leq)$. One appealing and natural notion is to split the hierarchy
into three parts: One part related to contents of the emails, e.g., if an email is 
related to a conference or not, if it is used for its organization, etc. A second 
part related to the sender or receiver of the email. And a third part describing 
aspects of the mailing process (whether it is an inbound or an outbound mail 
etc.). An example of a hierarchy is given in Figure 1. (The right window of the 
screenshot is explained in Section 4.) 

Even when the hierarchy imposed on the catchwords by the user is a tree, the 
resulting concept lattice — which we use as the search space — is by no means 
a forest. Consider for example the concept generated by the conjunction of the 
two catchwords ‘ICCS 2000’ and ‘conference organization’. It will have at least 
two incomparable super-concepts, namely the one generated by the catchword 
‘ICCS 2000’ and the one generated by the catchword ‘conference organization’. 
In general, all we know is that the resulting concept lattice is embedded as a
join-semilattice in the lattice of all order ideals of $(M, \leq)$ (i.e., all subsets $X$ of
$M$ s.t. $x \in X$ and $x \leq y$ imply $y \in X$). ^

^ The use of this structure in the framework of knowledge discovery in databases is 
analyzed in more detail under the name of power scale in [5]. Refer also to the 
theorem of Birkhoff (stated for instance in [3, Theorem 39]). 
2.3 Conceptual Scales for Navigating through the Set of Emails

Conceptual scaling has been introduced in order to deal with many-valued attributes.
Often attributes are not one-valued, as for instance with the catchwords
given above, but instead allow a range of values. This is modeled by a many-valued
context. A many-valued context is roughly equivalent to a relation of a
relational database with one field being a primary key. As one-valued contexts
are special cases of many-valued contexts, conceptual scaling can also be applied
to one-valued contexts in order to reduce the complexity of the visualization.

In this paper, we only deal with one-valued formal contexts. Readers who
are interested in the exact definition of many-valued contexts and the use of 
conceptual scaling in this more general case are referred to [3]. Applied to one- 
valued contexts, conceptual scales are used to determine the concept lattice 
which arises from one vertical ‘slice’ of a large context: 

Definition 1. A conceptual scale for a subset $B \subseteq M$ of attributes is a (one-valued)
formal context $\mathbb{S}_B := (G_B, B, \ni)$ with $G_B \subseteq \mathfrak{P}(B)$. The scale is called
consistent with respect to $K := (G, M, I)$ if $\{g\}' \cap B \in G_B$ for each $g \in G$. For a
consistent scale $\mathbb{S}_B$, the context $\mathbb{S}_B(K) := (G, B, I \cap (G \times B))$ is called its realized
scale.

Conceptual scales are used to group together related attributes. They are de- 
termined as required by the user, and the realized scales are derived from them 
when a diagram is requested by the user. 
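
A realized scale restricts the incidence relation of the email context to the attributes of the scale, and a scale is consistent if every email's restricted catchword set occurs among the objects of the scale. The following sketch illustrates both notions; the function names, the sample emails, and the representation of scale objects as frozensets are assumptions made for this example.

```python
def realized_scale(context, b):
    """Restrict each email's catchwords to the attribute set b of the scale."""
    return {g: attrs & b for g, attrs in context.items()}

def is_consistent(context, b, scale_objects):
    """Check that every restricted catchword set is an object of the scale."""
    return all(frozenset(attrs & b) in scale_objects for attrs in context.values())

# Hypothetical email context: email -> catchwords
emails = {
    "e1": {"iccs", "cole"},
    "e2": {"kli"},
    "e3": {"stumme"},
}

# Hypothetical scale on two conference catchwords; its objects are subsets of b
b = {"iccs", "kli"}
scale_objects = {frozenset(), frozenset({"iccs"}), frozenset({"kli"})}

consistent = is_consistent(emails, b, scale_objects)
realized = realized_scale(emails, b)
```

The concept lattice of `realized` is what would be drawn when the user requests a diagram for this scale.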

The Conceptual Email Manager stores all scales which the user has defined in 
previous sessions. To each scale, she can assign a unique name. This is modeled 
by a mapping. 

Definition 2. Let $S$ be a set, whose elements are called scale names. The mapping

$\alpha\colon S \to \mathfrak{P}(M)$

defines for each scale name $s \in S$ a scale $\mathbb{S}_s := \mathbb{S}_{\alpha(s)}$.

For instance, the user may introduce a new scale which classifies the emails
according to being related to a conference by adding a new element ‘Conference’
to $S$ and by defining $\alpha(\text{Conference})$ := {CKP ‘96, AA 55, KLI ‘98, Wissen ‘99,
ICCS 2000}.

Observe that $S$ and $M$ need not be disjoint. This allows for instance the
following construction which deduces conceptual scales directly from the subsumption
hierarchy: Let $S := \{m \in M \mid \exists n \in M\colon n < m\}$, and define, for
$s \in S$, $\alpha(s) := \{m \in M \mid m \prec s\}$ (with $x \prec y$ if and only if $x < y$ and there is
no $z$ s.t. $x < z < y$). This means that all catchwords $m \in M$ which are neither
minimal nor maximal in the hierarchy are at the same time considered as the
name of a scale $\mathbb{S}_m$ and as a catchword of another scale $\mathbb{S}_n$ (where $m \prec n$). In this
paper, we will call scales constructed this way default scales.
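
The construction of default scales can be sketched as follows: every catchword that has something strictly below it becomes a scale name, and its scale attributes are its lower covers, i.e. the catchwords directly below it. The representation of the strict order as a set of (smaller, larger) pairs, the sample hierarchy, and the name `default_scales` are assumptions of this example.

```python
def default_scales(catchwords, less):
    """Map each non-minimal catchword to the set of its lower covers.

    less is the strict order as a set of (smaller, larger) pairs and is
    assumed to be transitively closed.
    """
    names = {m for m in catchwords if any((n, m) in less for n in catchwords)}

    def lower_covers(s):
        return {
            m
            for m in catchwords
            if (m, s) in less
            and not any((m, z) in less and (z, s) in less for z in catchwords)
        }

    return {s: lower_covers(s) for s in names}

# Hypothetical hierarchy: iccs2000 < iccs < conferences, kli < conferences
catchwords = {"conferences", "iccs", "kli", "iccs2000"}
less = {
    ("iccs", "conferences"),
    ("kli", "conferences"),
    ("iccs2000", "iccs"),
    ("iccs2000", "conferences"),
}

alpha = default_scales(catchwords, less)
```

Here ‘iccs’ is both the name of a scale and an attribute of the scale named ‘conferences’.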

This last construction has first been presented in [10] for defining a hierarchy 
of conceptual scales for the library information system [7]. In [10], however, only 
this special construction was considered. It turns out that, in general, a more
flexible construction is desirable. In the library information system, for instance,
one is also interested in scales for the minimal elements in $(M, \leq)$. Each such scale
$\mathbb{S}_m$ has as attributes the upper covers of $m$ (i.e., all $n \in M$ with $m \prec n$). This
construction is made possible by using the function $\alpha$ which we have introduced
in this paper.

3 Requirements of the Conceptual Email Manager 

In this section, we discuss requirements of a conceptual email manager based 
on the paradigm of Formal Concept Analysis. In the following section we shall 
explain how our implementation responds to these requirements. 

The requirements may be divided along the same lines as the underlying 
mathematical structures defined in Section 2. Briefly stated the requirements 
are: 

1. to assist the user in building, browsing and modifying the catchword 
hierarchy; 

2. to help the user modify the scale function α;

3. to allow the user to manage the assignment of catchwords to email 
documents; and 

4. to assist the user in searching the conceptual space of emails for both
individual emails and conceptual groupings of emails.

In addition to the requirements stated above, a good email system needs to be
able to send, receive and display emails by processing the various email formats
and interacting with the currently popular email protocols. Since these
requirements are already well understood and implemented by existing email
programs, they will not be discussed further in this paper.

Browsing and Modifying the Catchword Hierarchy. The catchword hierarchy
is a partially ordered set (M, ≤) where each element of M is a catchword.
Listed below are the requirements related to browsing and modifying the
catchword hierarchy.

1. The program should display graphically the structure of the partial order
(M, ≤). The ordering relation must be clearly evident to the user.

2. It must be possible, via a series of graphical manipulations initiated by the
user and implemented in the program, to add and to delete elements and to
alter the ordering relation. It should be possible to create any partial order
within a reasonable size limit.

Modifying the Scale Function. The user must be able to modify the scale
function α, explained in Section 2. Therefore the tool should provide a suitable
visualization of the function. The program must allow an overlap between the
set S of scale names and the set M of catchwords.

Managing the Assignment of Catchwords to Emails. The program should
store the formal context (G, M, I) and ensure that the compatibility condition
(†) is always satisfied. It is inevitable that the program will sometimes have
to modify the formal context, after a change is made to the catchword hierarchy,
in order to satisfy the compatibility condition. This modification can be made
either automatically, or via an interactive process where the user is asked
whether the changes should be made.

The program must support two mechanisms for the association of catchwords 
to emails. Firstly there should be a mechanism as described in Section 2.1 by 
which emails are semi-automatically associated with catchwords based on the 
email content. Secondly the user should be able to view and modify the associ- 
ation of catchwords with emails. 



Navigating the Conceptual Space. The program should assist the naviga- 
tion of the conceptual space of the emails by drawing line diagrams of concept 
lattices arising from conceptual scales [3]. These line diagrams should extend to
locally nested line diagrams [9,10]. The program must allow the retrieval and 
viewing of emails that form the extension of concepts displayed in these line 
diagrams. 



4 Implementation 

This section divides the description of the implementation of our conceptual 
email manager, CEM, into a structure similar to that presented in Section 3. 



4.1 Catchword Hierarchy 

Browsing the Hierarchy. The user is presented with a view of the hierarchy
(M, ≤) as a tree widget,* shown in Figure 1. The tree widget has the advantage
that most computer users are familiar with its operation, and that it provides
a compact representation (in the sense of space used on the screen) of a tree
structure.

* A widget is a graphical user interface component with a well defined behaviour,
usually mimicking some physical object.

The catchword hierarchy, being a partially ordered set, has a more general
structure than that of a tree. No limitation is placed by the program on the
structure of the partial order in general. The following definition of the tree
derived from the catchword hierarchy fixes the contents and structure of the
tree widget.

Let (M, ≤) be a partially ordered set and denote the set of all sequences of
elements from M by M* (including the empty sequence ε). Then the labeled
tree derived from the catchword hierarchy is comprised by (T, ⊑, label) where

    T := {(m_1, ..., m_n) ∈ M* | m_i ≺ m_{i+1}, m_n ∈ max(M)} ∪ {ε},

w_1 ⊑ w_2 iff w_2 is a suffix of w_1, and label: T \ {ε} → M is the function
defined by label(m_1, ..., m_n) := m_n.

Each tree node is identified by a path from a catchword to the top of the 
catchword hierarchy. Although the tree representation has the disadvantages 
that elements from the partial order occur multiple times in the tree and that 
the tree can become large, the saving of space and the regular structure are our 
reasons to prefer it to other order representations. If the user keeps the number 
of elements with multiple parents in the partial order to a small number then 
the tree is manageable. 
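
To see how catchwords with several parents are duplicated in the tree, the
non-empty tree nodes can be enumerated as chains of upper covers ending in a
maximal element. The following is a minimal sketch under assumed names
(`tree_nodes`, the map `upper_covers`, and the toy catchwords are hypothetical);
the empty sequence, which serves as the root, is omitted.

```python
def tree_nodes(upper_covers):
    """Enumerate all cover chains (m_1, ..., m_n) ending in a maximal element."""
    nodes = []

    def extend(path):
        ups = upper_covers.get(path[-1], set())
        if not ups:                      # last element is maximal: a tree node
            nodes.append(path)
        for u in sorted(ups):            # otherwise climb to every upper cover
            extend(path + (u,))

    for m in sorted(upper_covers):
        extend((m,))
    return nodes


# Hypothetical hierarchy: "ICCS PC" has two parents, so it occurs twice.
upper_covers = {
    "Email": set(),
    "Conference": {"Email"},
    "Colleagues": {"Email"},
    "ICCS PC": {"Conference", "Colleagues"},
}
nodes = tree_nodes(upper_covers)
pc_nodes = [p for p in nodes if p[0] == "ICCS PC"]
```

Each additional parent multiplies the number of paths through a catchword,
which is why the tree stays manageable only while few elements have multiple
parents.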



Modifying the Hierarchy (M, ≤). The program provides four operations
for modifying the hierarchy: insert catchword, remove catchword, insert
ordering and remove ordering. More complex operations provided to the user,
for example moving an item in the taxonomy, are resolved internally to sequences
of these four operations. In this section we denote the order filter (also called
the up-set) of m as ↑m := {x ∈ M | m ≤ x}, the order ideal (also called
the down-set) of m as ↓m := {x ∈ M | x ≤ m}, the lower cover of m as
≺m := {x ∈ M | x ≺ m}, and the upper cover of m as ≻m := {x ∈ M | x ≻ m}.

The operation insert catchword simply adds a new catchword to M, and
leaves the ≤ relation unchanged. This means that the new catchword is
incomparable to all other catchwords. The remove catchword operation takes a
single parameter a ∈ M, and simply removes a from M and
((↓a) × {a}) ∪ ({a} × (↑a)) from the ordering relation.

The operation insert ordering takes two parameters a, b ∈ M and inserts
into the relation ≤ the set (↓b) × (↑a). The operation has been drawn in the
left diagram in Figure 2, which serves as a form of Venn diagram for the up-sets
and down-sets of a and b before and after the insert operation. The shading gives
an indication of corresponding regions.

The insertion of the ordering b ≤ a into ≤ will require the insertion of the
set {g ∈ G | (g, b) ∈ I} × (↑a \ ↑b) into I. The portion of M whose image under
the relation I will require an update is the upper shaded part in the rightmost
diagram in Figure 2.
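
The insert ordering operation and the accompanying update of I can be sketched
as follows. This is an illustration under assumptions, not the CEM code: the
order is held as a reflexive, transitively closed set of pairs (x, y) meaning
x ≤ y, and the function names are hypothetical.

```python
def up(leq, m):
    """Order filter of m: all y with m <= y."""
    return {y for (x, y) in leq if x == m}


def down(leq, m):
    """Order ideal of m: all x with x <= m."""
    return {x for (x, y) in leq if y == m}


def insert_ordering(leq, I, a, b):
    """Insert b <= a: add (down b) x (up a) to the order and extend I."""
    up_a, up_b, down_b = up(leq, a), up(leq, b), down(leq, b)
    new_leq = leq | {(x, y) for x in down_b for y in up_a}
    emails_with_b = {g for (g, m) in I if m == b}
    new_I = I | {(g, m) for g in emails_with_b for m in up_a - up_b}
    return new_leq, new_I


# Toy data: c <= b holds, a is incomparable; email g1 carries b and c.
leq = {("a", "a"), ("b", "b"), ("c", "c"), ("c", "b")}
I = {("g1", "c"), ("g1", "b")}
new_leq, new_I = insert_ordering(leq, I, "a", "b")
```

After the call, both b and c lie below a, and g1 has inherited the catchword a,
so the compatibility condition still holds.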

The operation remove ordering takes two parameters a, b ∈ M where a is
an upper cover of b. The remove ordering operation removes from ≤ the set
((↓b) \ ↓(≺a \ {b})) × ((↑a) \ ↑(≻b \ {a})). The right diagram in Figure 2 may
be used to visualize the remove operation. Similarly to the insert operation, the
removal of the ordering b ≤ a from ≤ will require a re-computation of the image
in I under the elements from {a} × ((↑a) \ ↑(≻b \ {a})). This region has been
shaded in the upper right of Figure 2.



4.2 Modifying the Scale Function 

The set S of scale names, as explained in Section 2, is not necessarily disjoint
from M; thus the tree representation of M already presents a view of a portion
of S. In order to reduce the complexity of the graphical user interface, we make
S equal to M. That is: all catchwords are scale names, and all scale names are
catchwords. Such an assumption is made possible by the definition, given in
Section 2, of the default scale for a catchword. A result of this definition is that
catchwords with no lower covers will map, under the scale function α, to the
empty set.

The function α maps each catchword m to a set of catchwords. The program
displays this set of catchwords, when requested by the user, using a dialog (see
Figure 3). The dialog box contains a set of catchwords available for membership
in α(m). In Figure 3 this set of candidates has been restricted to the down-set
of m. An icon (either a green tick or a red cross) is used to indicate membership
(or not) in the set of catchwords given by α(m). By clicking on the icon the user
can change the definition of α(m).

By displaying only the down-set of m in the dialog box, the program restricts
the definition of α to α(m) ⊆ ↓m. This restriction has an effect on the “remove
ordering operation” defined on (M, ≤). When the ordering a ≤ b is removed,
the image of the function α for attributes in ↑a is automatically checked and,
if necessary, modified.

The program has an intended mode of operation for expert users in which
the restriction α(m) ⊆ ↓m is lifted. In this mode the user has all catchwords
available for inclusion in α(m), and may choose the set S of scale names to be
different from the set M of catchwords.

When the function α is changed by the user, the set {S_α(s) | s ∈ S} of
scales is changed automatically. This update occurs regardless of the mode of
operation. The new/modified scales can then be used directly for navigating in
the concept space as described in Section 4.4.



4.3 Associating Emails with Catchwords 

Each member of (M, ≤) is associated with a query term, which in this application
is a set of section/word pairs. For our purposes a section of an email is either
a header field, e.g. the “From:” field, or the section “body”, which is composed
of the parts* of the email directly encoding text. More formally stated: Let H
be the set of sections found in the email documents and W the set of words found
in the email documents; then the function query: M → 𝔓(H × W) attaches to
each attribute a set of section/word pairs.

* The MIME extension to the email format allows an email document to have multiple
parts. These multiple parts are sometimes referred to as attachments.

Fig. 3. Dialog for editing α(Emails with Cole)

Let G be a set of email documents. Five relations, Q, R, R⁺, R⁻, and I,
are defined for managing the different ways in which email documents may be
associated with catchwords. Q ⊆ G × (H × W) is a relation between documents
and section/word pairs. Membership (g, (h, w)) ∈ Q indicates that
document g has word w in section h. Q is stored via an inverted file index
and is only updated when new email is presented to the system. The relation
R ⊆ G × M is derived from the relation Q and the function query via: (g, m) ∈ R
iff (g, (h, w)) ∈ Q for some (h, w) ∈ query(m). The relation R is only used as an
intermediate step and is calculated from Q as required by the program.
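
As an illustration of how R can be computed on demand from an inverted file
index, consider the following sketch. The dictionary layout of the index, the
function name `derive_R`, and the sample emails and queries are assumptions
made for this example only.

```python
def derive_R(index, query):
    """Return R: (g, m) is in R iff g contains some pair from query(m)."""
    return {(g, m)
            for m, pairs in query.items()
            for pair in pairs
            for g in index.get(pair, set())}


# Hypothetical inverted file index: (section, word) -> documents containing it.
index = {
    ("From", "cole"): {"e1"},
    ("body", "iccs"): {"e1", "e3"},
    ("body", "lattice"): {"e2"},
}
# Hypothetical query function: catchword -> set of section/word pairs.
query = {
    "Emails with Cole": {("From", "cole")},
    "ICCS": {("body", "iccs"), ("Subject", "iccs")},
}
R = derive_R(index, query)
```

Because each catchword only needs lookups for its own section/word pairs, R
does not have to be stored and can be recomputed whenever a query changes.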

The relations R⁺ and R⁻ store user judgments saying whether or not an
email should have a catchword m. These judgments will “over-rule” the relation
R. We impose the constraint

    (↑R⁺) ∩ (↓R⁻) = ∅    (#)

on the two relations R⁺ and R⁻, saying that a user is not allowed to contradict
himself. I.e., he is not allowed, for m ≥ n, to assign (g, m) to R⁻ and (g, n) to
R⁺.

The relation I respecting the compatibility condition (†) is derived from the
relations R, R⁺ and R⁻ using the following operator: For any relation J ⊆
G × M, we define ↑J := {(g, m) ∈ G × M | ∃n ∈ M: (g, n) ∈ J, n ≤ m}; the
operator ↓J used in (#) is defined dually. We obtain I as the incidence relation
for the formal context (G, M, I) mentioned in Section 2 by
I := ↑((R \ R⁻) ∪ R⁺).

These five relations are required to accommodate the different ways in which
an email may be associated with catchwords. Q and R associate emails with
catchwords via an automatic process based on content and queries attached to
catchwords, R⁺ and R⁻ associate emails based on user input, and I combines
these two sources with the hierarchy defined over the catchwords. By separating
the relations for automatic associations of catchwords to emails from the relations
for user defined associations, the program maintains a pure keyword index into
the email collection. Relations R and I are derived from Q, R⁺, and R⁻, and
so need not be stored. Storing I, however, greatly reduces the time complexity of
the program.

Fig. 4. Interface for viewing email and associating email with catchwords

When a batch of new emails, G_b, is presented to the program, the relation
Q is updated automatically by inserting new pairs, Q_b, into the relation. The
modification of Q into Q ∪ Q_b will cause an insertion of pairs R_b into R
according to query(m) and then subsequently an insertion of new pairs I_b into I.
The definitions are:

    Q_b ⊆ G_b × (H × W)
    R_b = {(g, m) | ∃(h, w) ∈ query(m) with (g, (h, w)) ∈ Q_b}
    I_b = {(g, m) | ∃m_1 ≤ m with (g, m_1) ∈ R_b}
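
The three definitions translate almost directly into set comprehensions. The
sketch below is only illustrative: it keeps Q as a plain set of pairs instead
of an inverted file index, represents ≤ as a reflexive, transitively closed set
of pairs, and uses the hypothetical name `add_batch`.

```python
def add_batch(Q, I, Q_b, query, leq):
    """Insert the batch pairs Q_b; return updated Q and I plus R_b and I_b."""
    # R_b: a new email g gets catchword m if it matches a pair of query(m).
    R_b = {(g, m)
           for m, pairs in query.items()
           for (g, hw) in Q_b if hw in pairs}
    # I_b: close R_b upwards along the catchword hierarchy.
    I_b = {(g, m) for (g, m1) in R_b for (lo, m) in leq if lo == m1}
    return Q | Q_b, I | I_b, R_b, I_b


# Toy data: "ICCS" lies below "Conference" in the hierarchy.
leq = {("ICCS", "ICCS"), ("Conference", "Conference"), ("ICCS", "Conference")}
query = {"ICCS": {("body", "iccs")}, "Conference": set()}
Q_b = {("e3", ("body", "iccs"))}
Q, I, R_b, I_b = add_batch(set(), set(), Q_b, query, leq)
```

Only pairs for the new batch are computed, so existing entries of Q and I are
left untouched.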

The user can modify the association of emails with catchwords in two ways:
firstly by changing the relations R⁺ and R⁻, and secondly by making
modifications to the query function. In order to explain the user interface for
making modifications to R⁺ and R⁻ we introduce the following notation. For an
email g ∈ G, we define the restriction of any relation J ⊆ G × M to this email by
J_g := J ∩ ({g} × M). For the purpose of brevity of expression we shall say m
belongs to J_g if (g, m) ∈ J_g.

The user is able to view individual emails as shown in Figure 4. In this mode
icons are attached to catchwords in the tree widget displayed to the left of the
email. These icons indicate to the user how each of the catchwords is related to
the displayed email by R, R⁻, and ↑R⁺. The user is able to change the relations
R⁻ and R⁺ by interacting with the icons.

1. If m is not in R_g, ↑(R⁺_g), or R⁻_g then no icon is displayed.

2. If m is in R_g then a yellow tick (shown as white in Fig. 4) is displayed.

3. If m is in R⁻_g then a red cross is displayed.

4. If m is in ↑(R⁺_g) then a green tick (shown as black in Fig. 4) is displayed.

All combinations of these icons which do not include at the same time a red
cross and a green tick are possible.

The user can then determine that the displayed email has a catchword in I if
there is either a green tick or a yellow tick in the absence of a red cross. The
program provides two basic operations, associate attribute and disassociate
attribute, from which more complex operations for use in the user interface
may be constructed. The associate attribute operation takes two parameters,
an email document g and a catchword m. The operation inserts the pair
(g, m) into R⁺, and removes, for all n ≥ m, (g, n) from R⁻. Similarly, the
operation disassociate attribute takes two parameters, an email and a catchword.
The operation inserts (g, m) into R⁻ and removes, for all n ≤ m, (g, n) from R⁺.
The construction of the two operators guarantees that the constraint (#) is
always satisfied.
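
The two operations and the constraint (#) can be checked on a small example.
This is a sketch under assumptions: ≤ is a reflexive, transitively closed set
of pairs, "for all n ≥ m" is read as including m itself, and the function
names are hypothetical.

```python
def associate(R_plus, R_minus, leq, g, m):
    """Insert (g, m) into R+ and drop (g, n) from R- for every n >= m."""
    above = {n for (x, n) in leq if x == m}
    kept = {(e, n) for (e, n) in R_minus if not (e == g and n in above)}
    return R_plus | {(g, m)}, kept


def disassociate(R_plus, R_minus, leq, g, m):
    """Insert (g, m) into R- and drop (g, n) from R+ for every n <= m."""
    below = {n for (n, x) in leq if x == m}
    kept = {(e, n) for (e, n) in R_plus if not (e == g and n in below)}
    return kept, R_minus | {(g, m)}


def consistent(R_plus, R_minus, leq):
    """Constraint (#): up-closure of R+ and down-closure of R- are disjoint."""
    up = {(g, m) for (g, n) in R_plus for (x, m) in leq if x == n}
    down = {(g, m) for (g, n) in R_minus for (m, x) in leq if x == n}
    return not (up & down)


leq = {("ICCS", "ICCS"), ("Conference", "Conference"), ("ICCS", "Conference")}
# The user first rejected "Conference" for e1, then accepts "ICCS".
plus1, minus1 = associate(set(), {("e1", "Conference")}, leq, "e1", "ICCS")
# Rejecting "Conference" again withdraws the earlier acceptance of "ICCS".
plus2, minus2 = disassociate(plus1, minus1, leq, "e1", "Conference")
```

In both steps the later judgment wins and the contradicting earlier one is
dropped, which is what keeps (#) satisfied.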

The user is also able to influence the way that R is derived from Q by
modifying the query function. The user is able to modify query(m) using the
Scale Query field in the dialog box shown in Figure 3. After any such
modification to the query function the relations R and I are modified accordingly.

New emails presented to the system for automated indexing cause a
modification to the inverted file index consisting only of new entries. The
insertion of new email documents into an inverted file index is an efficient
operation. The complexity of inserting each document is O(1). When the user
modifies either R⁺ or R⁻ by removing or inserting a pair (g, m), all catchwords
in the order filter of m, or the order ideal, respectively, must be updated in I.
The expense of such an update depends on how I is stored but is likely to be
O(log(n)), where n is the average number of documents per attribute.

It is useful for the system to maintain the relation R⁺ for special catchwords
dependent on observation by the program of the user's behavior. Two examples of
such catchwords are “read emails” for emails that the user has displayed at some
time, and “unseen emails” for emails that the user has not yet been notified of.



4.4 Navigating the Conceptual Email Space 

To assist the user in navigating the conceptual space of emails, the program
draws simple line diagrams and (locally) nested line diagrams. A simple line
diagram is used to visualize a single scale, while nested line diagrams are used
to visualize combinations of scales. The concept lattices, from which the nested
line diagrams are drawn, are computed from the contexts given by S_α(s). The
contexts are calculated using the algorithm reported in [1], and the concept
lattices are calculated from these contexts via Ganter’s algorithm [3].
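
For readers unfamiliar with concept lattices, the following sketch enumerates
the formal concepts of a small context. It deliberately uses a naive closure of
every attribute subset rather than Ganter’s algorithm, so it is only suitable
for tiny scales; the sample context and the name `concepts` are assumptions.

```python
from itertools import combinations


def concepts(G, M, I):
    """Enumerate all formal concepts (extent, intent) of the context (G, M, I)."""
    def extent(B):
        return frozenset(g for g in G if all((g, m) in I for m in B))

    def intent(A):
        return frozenset(m for m in M if all((g, m) in I for g in A))

    found = set()
    for k in range(len(M) + 1):
        for B in combinations(sorted(M), k):
            A = extent(B)                 # objects sharing all attributes in B
            found.add((A, intent(A)))     # closing A yields a formal concept
    return found


# Hypothetical scale context: three emails, two catchwords.
G = {"e1", "e2", "e3"}
M = {"Conference", "ICCS"}
I = {("e1", "Conference"), ("e1", "ICCS"), ("e2", "Conference")}
lattice = concepts(G, M, I)
```

The extents of the resulting concepts are exactly the sets of emails that a
user can retrieve by selecting a node in the corresponding line diagram.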

The user may navigate the conceptual space of email documents for different
purposes:

1. to find collections of emails thematically linked;

2. to review the precision and recall of queries attached to catchwords by
comparing them with catchwords based on user judgments (for the purpose of
refining them for improving the query function); and

3. to review patterns of communication between different groups.

Fig. 5. Concept Lattice derived from the Scale for “Conference Related”.

Of these purposes the first is the most useful to common users of the program. A 
simple scenario in which the user has this first purpose is presented here. Imagine 
a researcher who was in the Program Committee (PC) of ICCS ’97 and was at 
that time co-authoring with other members of the PC for the same conference. 
For the organization of a conference in the year 2000, she wants to retrieve 
some facts about the organization of ICCS ’97. But she only remembers that she 
exchanged this information with one of the people she was co-authoring with 
for ICCS ’97, and that it was only one tiny part of a mail covering all kinds of 
topics. 

The researcher may begin her search by requesting a line diagram for the 
scale named “Conference Related”. This scale is shown in Figure 5. It shows that 
from her 2344 emails in total, there are 222 emails related to conferences, 145 
of which are related to conferences with papers submitted and 110 of which are 
related to both conference organisation and program committees. The researcher 
decides that the email she is looking for is likely to be under the catchword 
“Conferences with Papers”. As there are too many emails in its extent to be 
read through, she may for instance want to expand the concept. By choosing the 
scale S_Conferences 1997, she obtains Figure 6.

Now the researcher can for instance check the 19 mails related to “ICCS ’97”
and “Conference Organization/Program Committee”. If she still doesn’t find
the email she is looking for there, then she has to check either the 86 papers
related to “ICCS ’97” or even all 115 emails under the catchword “Conference
Organization”. Before doing this, however, she might want to differentiate these
concepts further, e.g. by zooming into them with the scale “Members of ICCS ’97
Program Committee”. If this scale doesn’t exist yet, then she can create it on
the fly using the widget for modifying the scale function (and eventually store
it for further use).

Fig. 6. Concept Lattice derived from the Scales for “Conference Related” and
“Conferences with Papers”.

Note that with a classical, tree-structured search hierarchy (where one usually 
has the names of the correspondents on the highest level), one would be forced 
to scan all branches starting with the names of the co-authors before one can 
tell the system constraints like “Conference Related”.

5 Outlook 

Having completed a prototype implementation of CEM (available on request
from the first author), the next step is to evaluate its operation in daily use and
to measure its scalability with respect to large data sets and distributed
collections of email. We also consider allowing the user to impose more structure
on M including conjunctive implications and negation, following the mathematical
foundation presented in [4], as well as a more expressive language for the query
function which allows for instance disjunctive queries.

Although CEM has been in this paper applied to email documents it has a 
more general use as a document management system. The next step therefore is 
to extend the current architecture to allow the user to associate catchwords with
files accessible remotely via the internet and also locally with the user's private
collection. This next step has the challenge of dealing with a large number of
protocols and file formats. The emergence of standards such as XML and RDF 
gives some hope for a general and unified method for processing of this myriad 
of data formats. Looking further ahead one can consider how a conceptual file 
management system might be used in a group environment or at an enterprise 
level where several users contribute to the structure of the hierarchy and the 
association of catchwords with files.

Abstract. The aim of this paper is to indicate how TOSCANA may
be extended to allow graphical representations not only of concept
lattices but also of concept graphs in the sense of Contextual Logic. The
contextual-logic extension of TOSCANA requires the logical scaling of
conceptual and relational scales for which we propose the Peircean
Algebraic Logic as reconstructed by R. W. Burch. As graphical
representations we recommend, besides labelled line diagrams of concept lattices
and Sowa’s diagrams of conceptual graphs, particular information maps
for utilizing background knowledge as much as possible. Our
considerations are illustrated by a small information system about the domestic
flights in Austria.



Contents 

1. TOSCANA

2. Set-theoretical Semantics of Contextual Logic

3. Conceptual and Relational Scales

4. Graphical Representations 

1 TOSCANA 

TOSCANA has been developed at Darmstadt University of Technology as a
computer program for analyzing and exploring data by methods of Formal
Concept Analysis (see [KSVW94], [VW95], [GW99]). It has been used in a wide
range of application domains such as medicine, psychology, social sciences,
linguistics, information sciences, machine and civil engineering etc. (cf. [GSW98],
[SW00]). A typical application combines TOSCANA and a (relational) database
to a TOSCANA-system allowing the representation, the maintenance, and the
activation of information so that users of the system may gain actual knowledge
about interesting aspects of the relevant application domain. For the
human-machine interaction, TOSCANA offers labelled (nested) line diagrams of
concept lattices, representing conceptual relationships of the stored data, and
allows the navigation through the data by changing from one line diagram to
another (and so on). Although the graphical representations of concept lattices
have proven useful in numerous applications, there are many cases in which such
representations of conceptual hierarchies are not sufficient and, in addition,
representations of (non-hierarchical) relations become desirable. Those cases have
stimulated the extension of Formal Concept Analysis to Contextual Logic, a
formal logic semantically based on formal contexts from which concept lattices
are derivable and also concept graphs, combining formal concepts and
conceptual relations (see [Wi97], [Pr98], [PW99]). The aim of this paper is to
indicate how TOSCANA may be extended to allow graphical representations not only
of concept lattices but also of concept graphs in the sense of Contextual Logic.

Before discussing a contextual-logic extension of TOSCANA, we explain
through an example how TOSCANA can be used to establish a conceptual
information system based on methods of Formal Concept Analysis. The data for
the example, which have already been considered in [PW99], are described in
Figure 1 by a data table about the domestic flights in Austria. For a
TOSCANA-system, such a table is usually considered as a many-valued context
(G, M, W, I) for which G is the object set comprising all listed flights, M is the
attribute set consisting of the attributes “Airline”, “Departure Airport”,
“Departure Time”, “Arrival Airport”, “Arrival Time”, “Days”, and “Aircraft”, W is
a set containing all attribute values described by the entries in the columns of
the table, and I is the ternary relation between G, M, and W indicating which
object has which value for which attribute.

The first step of establishing a TOSCANA-system for the many-valued
context represented in Figure 1 is to turn it into a number of formal contexts,
called conceptual scales, grasping the information coded by the data. Such a
transformation should be guided by purposes which we assume in our example to
lie in the support of flight information. For our explanations, we choose the seven
conceptual scales “Connections”, “Departure Time (Hours)”, “Departure Time
(Minutes)”, “Arrival Time (Hours)”, “Arrival Time (Minutes)”, “Days”, and
“Airline/Aircraft”. The concept lattice of the conceptual scale “Connections” is
presented in Figure 2; it yields the information about the departure and arrival
airports of the 75 domestic flights which are denoted by their flight number;
for instance, the flight 1583 departs from Innsbruck and arrives at Graz
(surprisingly, there is no flight from Graz to Innsbruck). The information about
the departure time of the flights is well represented by the two interordinal scales
“Departure Time (Hours)” and “Departure Time (Minutes)” where the first has
the attributes ≤ 6, ..., ≤ 23 and ≥ 6, ..., ≥ 23 and the second has the attributes
≤ 00, ≤ 05, ..., ≤ 50, ≤ 55 and ≥ 00, ≥ 05, ..., ≥ 50, ≥ 55 (cf. [PW99], Fig. 3); for
instance, the departure time 15.10 of flight 1583 is uniquely determined by the
attributes ≤ 15 and ≥ 15 of the first scale and ≤ 10 and ≥ 10 of the second scale.
The arrival time is represented analogously by two interordinal scales. The choice
of interordinal scales for representing time has proved worthwhile because they
allow the expression of time intervals by formal concepts. The conceptual scale
“Airline/Aircraft” has the attributes “VO”, “OS”, “F70”, “DH8”, and “CRJ”
yielding a six-element concept lattice. The concept lattice of the conceptual
scale “Days”, restricted to the flights from Vienna to Innsbruck, is presented in
Figure 3.
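
The way an interordinal scale pins down a value can be illustrated with a short
sketch. The function name and the string encoding of the attributes are
assumptions made for this example; the thresholds follow the two departure-time
scales described above.

```python
def interordinal_attributes(value, thresholds):
    """Attributes '<= t' and '>= t' of an interordinal scale that hold for value."""
    return ([f"<= {t}" for t in thresholds if value <= t] +
            [f">= {t}" for t in thresholds if value >= t])


hours = range(6, 24)         # thresholds 6, ..., 23 of the hours scale
minutes = range(0, 60, 5)    # thresholds 00, 05, ..., 55 of the minutes scale

# Departure time 15.10 of flight 1583, split into hours and minutes.
attrs_h = interordinal_attributes(15, hours)
attrs_m = interordinal_attributes(10, minutes)

# The only threshold for which both '<= t' and '>= t' hold is the value itself.
exact_h = [t for t in hours if f"<= {t}" in attrs_h and f">= {t}" in attrs_h]
exact_m = [t for t in minutes if f"<= {t}" in attrs_m and f">= {t}" in attrs_m]
```

Dropping the exact thresholds and keeping, say, “≥ 12” and “≤ 17” describes a
time interval instead, which is how intervals appear as formal concepts.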

Flight  Airline  Departure           Arrival             Days   Aircraft
                 Airport     Time    Airport     Time
070     VO       Vienna      07.50   Innsbruck   08.40   1-6    F70
071     VO       Innsbruck   06.25   Vienna      07.20   1-5    F70
072a    VO       Vienna      10.20   Innsbruck   11.35   6      DH8
072b    VO       Vienna      10.50   Innsbruck   12.05   1-5,7  DH8
073a    VO       Innsbruck   08.35   Vienna      09.45   67     DH8
073b    VO       Innsbruck   09.05   Vienna      09.55   1-5    F70
074     VO       Vienna      13.55   Innsbruck   15.10   2-5,7  DH8
075     VO       Innsbruck   11.40   Vienna      12.50   1-5    DH8
076a    VO       Vienna      17.45   Innsbruck   18.40   1-6    F70
076b    VO       Vienna      18.40   Innsbruck   19.55   7      DH8
077     VO       Innsbruck   15.35   Vienna      16.45   2-5    DH8
078a    VO       Vienna      20.35   Innsbruck   21.25   1-4    F70
078b    VO       Vienna      21.30   Innsbruck   22.45   7      DH8
078c    VO       Vienna      21.40   Innsbruck   22.35   5      CRJ
330     VO       Linz        06.20   Salzburg    06.50   1-6    CRJ
331     VO       Salzburg    11.20   Linz        11.45   1-5    CRJ
332     VO       Linz        16.05   Salzburg    16.35   1-5    CRJ
333     VO       Salzburg    21.50   Linz        22.15   1-5,7  CRJ
409     VO       Graz        12.10   Linz        12.45   1-5    CRJ
410     VO       Linz        16.10   Graz        16.50   1-5    CRJ
412     VO       Linz        10.35   Graz        11.10   1-5    CRJ
413     VO       Graz        06.15   Salzburg    06.50   1-5    CRJ
415     VO       Graz        17.30   Salzburg    18.10   1-5    CRJ
416     VO       Salzburg    21.50   Graz        22.25   1-5,7  CRJ
417     VO       Graz        17.15   Linz        17.45   7      CRJ
501     VO       Klagenfurt  06.00   Salzburg    06.45   1-5    DH8
502     VO       Salzburg    21.55   Klagenfurt  22.40   1-5,7  DH8
531*    VO-OS    Linz        06.00   Vienna      06.45   1-6    DH8
532*    VO-OS    Vienna      10.40   Linz        11.20   1-5,7  DH8
533*    VO-OS    Linz        08.35   Vienna      09.25   1-7    DH8
534*    VO-OS    Vienna      22.15   Linz        23.00   1-5,7  DH8
536a*   VO-OS    Vienna      17.10   Linz        17.55   5      DH8
536b*   VO-OS    Vienna      17.15   Linz        17.55   1-4,7  DH8
537*    VO-OS    Linz        12.00   Vienna      12.50   1-5,7  DH8
538*    VO-OS    Vienna      20.30   Linz        21.15   1-7    DH8
539*    VO-OS    Linz        18.15   Vienna      19.00   1-5,7  DH8
540*    VO-OS    Vienna      10.45   Graz        11.30   1-7    DH8
541*    VO-OS    Graz        06.05   Vienna      06.45   1-6    DH8
542*    VO-OS    Vienna      13.50   Graz        14.35   1-5    DH8
543*    VO-OS    Graz        08.50   Vienna      09.35   1-7    DH8
544*    VO-OS    Vienna      17.20   Graz        18.00   1-7    DH8
545*    VO-OS    Graz        11.55   Vienna      12.35   1-5    DH8
546*    VO-OS    Vienna      19.40   Graz        20.20   1-7    DH8
547*    VO-OS    Graz        15.30   Vienna      16.15   1-5,7  DH8
548*    VO-OS    Vienna      22.30   Graz        23.10   1-5,7  DH8
549*    VO-OS    Graz        18.25   Vienna      19.05   1-5,7  DH8
550*    VO-OS    Vienna      07.25   Klagenfurt  08.15   1-5    DH8
551*    VO-OS    Klagenfurt  06.00   Vienna      06.50   1-6    DH8
552*    VO-OS    Vienna      10.40   Klagenfurt  11.30   1-7    DH8
553*    VO-OS    Klagenfurt  08.40   Vienna      09.30   1-7    DH8
554*    VO-OS    Vienna      13.55   Klagenfurt  14.50   1-5    DH8
555*    VO-OS    Klagenfurt  11.55   Vienna      12.45   1-7    DH8
556*    VO-OS    Vienna      17.10   Klagenfurt  18.00   1-7    DH8
557*    VO-OS    Klagenfurt  15.15   Vienna      16.10   1-5    DH8
558*    VO-OS    Vienna      19.50   Klagenfurt  20.45   1-7    DH8
559*    VO-OS    Klagenfurt  18.20   Vienna      19.10   1-7    DH8
560*    VO-OS    Vienna      22.30   Klagenfurt  23.20   457    DH8
561*    VO-OS    Klagenfurt  21.00   Vienna      22.00   457    DH8
590*    VO-OS    Vienna      10.25   Salzburg    11.20   1-7    DH8
591*    VO-OS    Salzburg    17.15   Vienna      18.10   7      DH8
593*    VO-OS    Salzburg    08.15   Vienna      09.15   1-7    DH8
594*    VO-OS    Vienna      17.35   Salzburg    18.35   1-7    DH8
595*    VO-OS    Salzburg    11.45   Vienna      12.40   1-7    DH8
596a*   VO-OS    Vienna      20.25   Salzburg    21.20   6      DH8
596b*   VO-OS    Vienna      20.35   Salzburg    21.30   1-5,7  DH8
597*    VO-OS    Salzburg    19.05   Vienna      20.00   1-7    DH8
1557    VO       Klagenfurt  16.00   Vienna      16.50   7      DH8
1583    VO       Innsbruck   15.10   Graz        15.55   7      CRJ
1596    VO       Vienna      14.05   Salzburg    15.05   5      DH8
2980    VO       Innsbruck   06.10   Salzburg    06.40   1-7    DH8
2981    VO       Salzburg    12.30   Innsbruck   13.00   1-7    DH8
2983    VO       Salzburg    16.40   Linz        17.05   1-5    CRJ
2984    VO       Innsbruck   14.35   Salzburg    15.10   1-7    DH8
2985    VO       Salzburg    21.40   Innsbruck   22.05   1-7    DH8
2986    VO       Innsbruck   10.20   Salzburg    10.55   7      DH8

Fig. 1. A data table about the domestic flights in Austria

Fig. 2. The concept lattice of the conceptual scale “Connections”

Fig. 3. The concept lattice of the conceptual scale “Days”, restricted to the flights
from Vienna to Innsbruck



The conceptual scales of a TOSCANA-system, each given by a formal con- 
text together with a line diagram of its concept lattice, are viewed on three levels 
of abstraction to allow high flexibility of their use: First, a conceptual scale is 
considered as an abstract scale having only (clarified) abstract objects and at- 
tributes without a particular meaning. Secondly, a conceptual scale is considered 
as a concrete scale having still abstract objects but meaningful attributes with 
respect to an application domain. Thirdly, a conceptual scale is considered as a 
realized scale having now both: meaningful objects and attributes with respect 
to an application domain. Abstract scales may be used in different applications,
while concrete scales are designed for a specific application, but allow
different realizations concerning the set of objects (for instance, concerning
actual revisions of the flights in the database on domestic flights in Austria).

Fig. 4. The basic architecture of a TOSCANA-system

The basic architecture of a TOSCANA-system is described in Figure 4.
Accordingly, a TOSCANA-system divides into a data part and a program part.
In the data part, the information about the objects of the application domain is
coded in the (relational) database, while the domain-specific concrete scales
(including line diagrams of their concept lattices) and SQL queries for the
objects of the application domain are collected in the conceptual scheme; in our
example this means that the data table of Figure 1 is stored in the database and
that the relevant seven conceptual scales, understood as concrete scales, are
available in the conceptual scheme of the data part. For the representation of
concrete scales with line diagrams of their concept lattices in the conceptual
scheme, the drawing program Anaconda yields a coding in the description language
ConScript (see [Vo96]). The drawings of the concept lattices have to be provided
in advance since, in general, well readable line diagrams cannot be generated
automatically.

The program part consists of TOSCANA and a relational database management
system (RDBMS). At runtime, TOSCANA loads the conceptual scheme and connects it
to an RDBMS which accesses the database in which the information of the
application domain is stored. The system offers the user the list of conceptual
scales, among which the user may choose one or more scales. If the user chooses
one scale, then TOSCANA computes the realized scale and the labelling of the
line diagram of its concept lattice by querying the actual objects in the
database, and displays the labelled line diagram on the screen. Figure 2,
showing the concept lattice of the realized scale “Connections”, is the result
of such an action. If the user chooses two (or more) scales, then she can either
zoom into a concept of the first scale to obtain a conceptual refinement of this
concept by the concept lattice of the second scale, or she can represent the
concept lattices of both scales simultaneously by a nested line diagram.
Figure 3 results, after choosing the conceptual scales “Connections” and “Days”,
from zooming in Figure 2 into the ninth object node from the left. A detailed
description of TOSCANA-systems can be found in [Na99].
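The step in which a concrete scale becomes a realized scale by querying the database can be illustrated with a minimal Python sketch. It is not the TOSCANA implementation: the table layout, the column names `flight`, `origin`, `destination`, and the dictionary `CONNECTIONS_SCALE` are assumptions made only for this illustration, and the three rows are taken from Figure 1.

```python
import sqlite3

# Hypothetical miniature of the flight table from Figure 1 (three rows only).
ROWS = [
    ("590", "Vienna", "Salzburg"),
    ("591", "Salzburg", "Vienna"),
    ("2986", "Innsbruck", "Salzburg"),
]

# A concrete scale: meaningful attributes, each paired with an SQL condition
# selecting the objects that have the attribute (this layout is an assumption).
CONNECTIONS_SCALE = {
    "from Vienna": "origin = 'Vienna'",
    "to Salzburg": "destination = 'Salzburg'",
}


def build_demo_db():
    """Create an in-memory database holding the demo rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE flights (flight TEXT, origin TEXT, destination TEXT)")
    conn.executemany("INSERT INTO flights VALUES (?, ?, ?)", ROWS)
    return conn


def realize_scale(conn, scale):
    """Return the realized scale as {attribute: set of flight numbers}."""
    realized = {}
    for attribute, condition in scale.items():
        cursor = conn.execute(f"SELECT flight FROM flights WHERE {condition}")
        realized[attribute] = {row[0] for row in cursor}
    return realized


if __name__ == "__main__":
    print(realize_scale(build_demo_db(), CONNECTIONS_SCALE))
```

The same concrete scale can be realized again after the flight table has been revised, which corresponds to the different realizations of a concrete scale mentioned above.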

The explanations concerning the TOSCANA-system of the domestic flights in
Austria might convince the reader that such an information system could be
useful; for instance, the object nodes of Figure 2 partition the flights
according to their departure and arrival airports and lead, after drilling down
to a suitable object node, to further information about departure and arrival
time, week days, airline, and even aircraft, relevant to flights from
specifically chosen departure and arrival airports. Nevertheless, the
representation of all relationships within conceptual hierarchies does not seem
sufficient for human-machine communication. Therefore, an extension of TOSCANA
should be developed which allows more direct representations of non-hierarchical
relations. Such an extension, mathematically based on Contextual Logic, is
described in the following sections.

2 Set-Theoretical Semantics of Contextual Logic 

Contextual Logic has been developed as a mathematization of the traditional
philosophical logic with its doctrines of concepts, judgments, and conclusions
[Wi00]; for that, the mathematical theory of concepts is taken from Formal
Concept Analysis, and the mathematical theory of judgments is derived from
Sowa’s Theory of Conceptual Graphs [So84]. The set-theoretical semantics of
Contextual Logic is based on power context families which are defined as follows:
a power context family is a sequence
$\vec{\mathbb{K}} := (\mathbb{K}_0, \mathbb{K}_1, \dots, \mathbb{K}_n)$ $(n \ge 2)$
of formal contexts $\mathbb{K}_k := (G_k, M_k, I_k)$ with $G_k \subseteq (G_0)^k$
for $k = 1, \dots, n$. The formal concepts of $\mathbb{K}_k$ $(1 \le k \le n)$
represent by their extents $k$-ary relations on the object set $G_0$; they are
therefore called the $k$-ary relational concepts of $\vec{\mathbb{K}}$.
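The definition can be made concrete with a small Python sketch. It is only an illustration under stated assumptions: a formal context is stored as a dictionary with the keys `"G"`, `"M"`, and `"I"`, the concepts are enumerated by brute force (adequate only for tiny contexts), and the three-context family below is a hypothetical fragment of the flight example.

```python
from itertools import combinations


def common_attributes(ctx, objects):
    """Attributes shared by all given objects (derivation of an object set)."""
    return {m for m in ctx["M"] if all((g, m) in ctx["I"] for g in objects)}


def common_objects(ctx, attributes):
    """Objects having all given attributes (derivation of an attribute set)."""
    return {g for g in ctx["G"] if all((g, m) in ctx["I"] for m in attributes)}


def concepts(ctx):
    """All formal concepts (extent, intent), found by closing every attribute subset."""
    found = set()
    attributes = sorted(ctx["M"])
    for size in range(len(attributes) + 1):
        for subset in combinations(attributes, size):
            ext = frozenset(common_objects(ctx, subset))
            found.add((ext, frozenset(common_attributes(ctx, ext))))
    return found


def is_power_context_family(family):
    """Check that every object of the k-th context is a k-tuple over G_0."""
    g0 = family[0]["G"]
    return all(
        len(g) == k and set(g) <= g0
        for k, ctx in enumerate(family) if k >= 1
        for g in ctx["G"]
    )


# Hypothetical fragment: K_0 holds the basic objects, K_1 is empty, K_2 holds pairs.
K0 = {"G": {"590", "Vienna", "Salzburg"},
      "M": {"flight", "airport"},
      "I": {("590", "flight"), ("Vienna", "airport"), ("Salzburg", "airport")}}
K1 = {"G": set(), "M": set(), "I": set()}
K2 = {"G": {("590", "Vienna"), ("590", "Salzburg")},
      "M": {"From", "To"},
      "I": {(("590", "Vienna"), "From"), (("590", "Salzburg"), "To")}}
FAMILY = [K0, K1, K2]

if __name__ == "__main__":
    print(is_power_context_family(FAMILY))
    for ext, inte in sorted(concepts(K2), key=lambda c: len(c[0])):
        print(sorted(ext), sorted(inte))
```

The extents of the concepts of `K2` are sets of pairs over `G_0`, i.e. binary relations on the basic objects, which is what the term relational concept refers to.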

According to [PW99], the formal judgments, also called concept graphs, of
a power context family can be represented (up to logical equivalence) as the
elements of all the direct products

$$\prod_{(k,g) \in U} U_k(g)$$

for which $U$ is a finite subset of $\bigcup_{k=0}^{n} \{k\} \times G_k$
satisfying the implication

$$(k,g) \in U \text{ and } g = (g_1, \dots, g_k) \;\Rightarrow\; (0,g_1), \dots, (0,g_k) \in U \quad (1 \le k \le n),$$

and $U_k(g)$ is the principal filter of $\underline{\mathfrak{B}}(\mathbb{K}_k)$
generated by the smallest concept having $g$ in its extent. An element
$a := (a_{(k,g)} \mid (k,g) \in U)$ of the product can be understood as the
family of the atomic concept graphs $[a_{(k,g)} : g]$ with $(k,g) \in U$ where
the so-called conceptual instance $[a_0 : g]$ is the element $a$ of the
(one-factor) product with $U := \{(0,g)\}$ and $a_{(0,g)} = a_0$ and, for
$k = 1, \dots, n$, the so-called relational instance $[a_k : g]$ is the element
$a$ of the product with $U := \{(k,g), (0,g_1), \dots, (0,g_k)\}$ (if
$g = (g_1, \dots, g_k)$) and $a_{(k,g)} = a_k$, $a_{(0,g_1)} = \top_0, \dots,
a_{(0,g_k)} = \top_0$ (in general, $\top_k$ denotes the largest concept of
$\mathbb{K}_k$); instead of $[a_k : g]$ we also write
$(a_k; [\top_0 : g_1], \dots, [\top_0 : g_k])$. Notice that a pair $(a_k, g)$
with $a_k \in \underline{\mathfrak{B}}(\mathbb{K}_k)$ and $g \in G_k$ forms an
atomic concept graph if and only if $g \in \mathrm{Ext}(a_k)$.

The formal judgments described in the preceding paragraph may be called 
concept graphs in atomistic normal form. From a concept graph in atomistic 
normal form one can deduce its equivalent concept graphs using the following 
equivalences (cf. [PW99], Prop. 1): 

(1) $([a_t : g] \mid t \in T)$ is equivalent to $[\bigwedge_{t \in T} a_t : g]$,

(2) $([a : g_t] \mid t \in T)$ is equivalent to $[a : \{g_t \mid t \in T\}]$,

(3) $(a; [b_1 : B_1], \dots, [b_k : B_k])$
    is equivalent to $([a : B_1 \times \dots \times B_k], [b_1 : B_1], \dots, [b_k : B_k])$,

(4) $((a; [b_1 : B_1], \dots, [b_k : B_k]), (\tilde{a}; [\tilde{b}_1 : \tilde{B}_1], \dots, [\tilde{b}_l : \tilde{B}_l]))$
    is equivalent to
    $((a; [b_1 : B_1], \dots, [b_k : B_k]), (\tilde{a}; [\tilde{b}_1 : \tilde{B}_1], \dots, [\tilde{b}_l : \tilde{B}_l]))/(i,j)$
    if $[b_i : B_i] = [\tilde{b}_j : \tilde{B}_j]$.

The equivalences (1), (2), and (3) are applied to suitable subfamilies of a family
of elementary concept graphs to obtain another family of elementary concept
graphs (in [PW99], an elementary concept graph is understood as a concept
graph with at most one relational concept). The equivalence (4) allows us to
identify equal conceptual instances related to different relational concepts, i.e.,
(4) is used to split up a concept graph into elementary concept graphs and to
combine concept graphs into one concept graph.

Formal judgments (i.e. concept graphs), which may be easily turned into
textual statements of “plain English” (see [ME99]), are useful for expressing
specific information contained in a given power context family; they are the
means “to make the data talk”. Therefore, it is desirable to derive from the
data in a relational database a power context family allowing one to deduce
informative concept graphs. For doing this, the method of relational scaling has
been proposed in [PW99]: If a relational database is given in the form of Codd’s
relational model $R \subseteq \prod_{t \in T} A_t$, then a derived power context
family $\vec{\mathbb{K}} := (\mathbb{K}_0, \mathbb{K}_1, \dots, \mathbb{K}_n)$ of
formal contexts $\mathbb{K}_k := (G_k, M_k, I_k)$ $(0 \le k \le n)$ is formed by
suitable subsets $G_0 \subseteq \bigcup_{t \in T} A_t$ and
$G_k \subseteq (G_0)^k$ together with corresponding attribute sets giving
purpose-oriented meaning to the data (the elements of $G_k$ should be
restrictions of elements of $\prod_{t \in T} A_t$).

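Understanding the data table in Figure 1 as a representation of a relational
model in the sense of Codd, we may derive a power context family
$(\mathbb{K}_0, \mathbb{K}_2)$ by the following definitions of object and
attribute sets (cf. [PW99]):

$$
\begin{aligned}
G_0 := {} & \{070, 071, \dots, 2986\} \,\cup \\
          & \{\text{Graz}, \text{Innsbruck}, \text{Klagenfurt}, \text{Linz}, \text{Salzburg}, \text{Vienna}\} \,\cup \\
          & \{06.00, 06.05, \dots, 23.20\} \,\cup \\
          & \{1, 2, 3, 4, 5, 6, 7\} \\[4pt]
G_2 := {} & \{(070, \text{Vienna}), (071, \text{Innsbruck}), \dots, (2986, \text{Innsbruck})\} \,\cup \\
          & \{(070, \text{Innsbruck}), (071, \text{Vienna}), \dots, (2986, \text{Salzburg})\} \,\cup \\
          & \{(070, 07.50), (071, 06.25), \dots, (2986, 10.20)\} \,\cup \\
          & \{(070, 08.40), (071, 07.20), \dots, (2986, 10.55)\} \,\cup \\
          & \{(070, 1), \dots, (070, 6), (071, 1), \dots, (071, 5), \dots, (2986, 7)\} \\[4pt]
M_0 := {} & \{\text{flight}, \text{airline}, \text{airport}, \text{time}, \text{days}, \text{aircraft}\} \,\cup \\
          & \{\text{Graz}, \text{Innsbruck}, \text{Klagenfurt}, \text{Linz}, \text{Salzburg}, \text{Vienna}\} \,\cup \\
          & \{{>}20\text{min}, {>}25\text{min}, {>}30\text{min}, {>}3\text{km}, {>}4\text{km}, {>}5\text{km}, {>}12\text{km}, {>}16\text{km}\} \,\cup \\
          & \{{<}06.00, {<}06.05, \dots, {<}23.20, \text{a.m.}, \text{p.m.}\} \,\cup \\
          & \{{>}06.00, {>}06.05, \dots, {>}23.20, \text{a.m.}, \text{p.m.}\} \,\cup \\
          & \{\text{Mo}, \text{Tu}, \text{We}, \text{Th}, \text{Fr}, \text{Sa}, \text{Su}\} \,\cup \\
          & \{\text{VO}, \text{OS}, \text{F70}, \text{DH8}, \text{CRJ}\} \\[4pt]
M_2 := {} & \{\text{From}, \text{To}, \text{Dept}, \text{Arrv}, \text{FlDays}\}
\end{aligned}
$$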

The attributes “> 20min”, “> 25min”, “> 30min” indicate the connecting times
at the corresponding airport and the attributes “> 3km”, “> 4km”, “> 5km”,
“> 12km”, “> 16km” indicate the distance from the airport to the corresponding
city; furthermore, the attributes “From” resp. “To” apply to the pairs
consisting of a flight number and a departure resp. arrival airport;
analogously, the attributes “Dept” resp. “Arrv” are understood with respect to
the departure resp. arrival times. The incidence relations $I_0$ and $I_2$ of
the formal contexts $\mathbb{K}_0$ and $\mathbb{K}_2$ reflect the declared or
obvious meaning of the attributes with respect to the given objects.
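How the object sets and the binary incidence can be read off from rows of the data table may be sketched in Python as follows. The sketch covers only the objects of $G_0$, the pairs of $G_2$, and the attributes From, To, Dept, Arrv, and FlDays; the row layout and the function name `derive_family` are assumptions for this illustration, and the three rows are copied from Figure 1.

```python
# Rows as (flight, origin, dept, destination, arrv, days); taken from Figure 1.
FLIGHTS = [
    ("590", "Vienna", "10.25", "Salzburg", "11.20", (1, 2, 3, 4, 5, 6, 7)),
    ("591", "Salzburg", "17.15", "Vienna", "18.10", (7,)),
    ("2986", "Innsbruck", "10.20", "Salzburg", "10.55", (7,)),
]


def derive_family(rows):
    """Return (G_0, G_2, I_2) derived from the rows of the flight table."""
    g0, incidence2 = set(), set()
    for flight, origin, dept, destination, arrv, days in rows:
        # Every value occurring in a row becomes a basic object of G_0.
        g0.update([flight, origin, destination, dept, arrv, *days])
        # Pairs (flight, value) are the objects of G_2, each with its attribute.
        incidence2.add(((flight, origin), "From"))
        incidence2.add(((flight, destination), "To"))
        incidence2.add(((flight, dept), "Dept"))
        incidence2.add(((flight, arrv), "Arrv"))
        incidence2.update(((flight, day), "FlDays") for day in days)
    g2 = {pair for pair, _ in incidence2}
    return g0, g2, incidence2


if __name__ == "__main__":
    g0, g2, i2 = derive_family(FLIGHTS)
    print(sorted(p for p, m in i2 if m == "From"))
```

Each pair in `g2` restricts one row of the table to two of its columns, in line with the remark that the elements of $G_k$ should be restrictions of rows of the relational model.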



3 Conceptual and Relational Scales 

In this section we discuss how a contextual-logic extension of a
TOSCANA-system may be developed on the basis of a power context family
$\vec{\mathbb{K}} := (\mathbb{K}_0, \mathbb{K}_1, \dots, \mathbb{K}_n)$ of formal
contexts $\mathbb{K}_k := (G_k, M_k, I_k)$ $(0 \le k \le n)$. It seems most
natural that the extended TOSCANA program is designed for the purpose of
combining several TOSCANA-systems, one for each formal context $\mathbb{K}_k$;
in this setting, the scales for $\mathbb{K}_0$ are called the conceptual scales,
while the scales for $\mathbb{K}_k$ are called the $k$-relational scales
$(k = 1, \dots, n)$. Of course, the TOSCANA-systems of the formal contexts of a
power context family should not be independent because satisfactory information
based on the data coded in the power context family also has to combine
information out of different formal contexts of the family. This combination can
be performed by the method of logical scaling, introduced in Formal Concept
Analysis by S. Prediger (see [Pr97]). In the case of the power context family
$\vec{\mathbb{K}}$, logical scaling means combining attributes out of
$\bigcup_{j=0}^{n} M_j$ to obtain new attributes for the formal contexts
$\mathbb{K}_k$; as a necessary construction tool, we propose the term formation
of the “Peircean Algebraic Logic” (PAL) which R. W. Burch created as “an attempt
to amalgamate various systems of logic that Peirce developed over his long
career” (see [Bu91]).

According to PAL, the basic operations to combine attributes of
$\bigcup_{j=0}^{n} M_j$ are the negation $\neg$, the permutations
$\pi \in S_j$, and two so-called joins, $\iota_1$ and $\iota_2$, extensionally
defined for $k$-ary attributes $m$ and $l$-ary attributes $\tilde{m}$
$(k \le l)$ by

(1) $(\neg m)^{I_k} := G_k \setminus m^{I_k}$,

(2) $(\pi m)^{I_k} := \{(g_{\pi(1)}, \dots, g_{\pi(k)}) \mid (g_1, \dots, g_k) \in m^{I_k}\}$,

(3) $(\iota_1 m)^{I_{k-2}} := \{(g_1, \dots, g_{k-2}) \mid \exists g_{k-1} \exists g_k : (g_1, \dots, g_k) \in m^{I_k} \text{ and } g_{k-1} = g_k\}$,

(4) $(\iota_2(m, \tilde{m}))^{I_{k+l-2}} := \{(g_1, \dots, g_{k-1}, \tilde{g}_2, \dots, \tilde{g}_l) \mid \exists g_k \exists \tilde{g}_1 : (g_1, \dots, g_k) \in m^{I_k} \text{ and } (\tilde{g}_1, \dots, \tilde{g}_l) \in \tilde{m}^{I_l} \text{ with } g_k = \tilde{g}_1\}$;

furthermore, the $k$-ary attributes $\bot_k$, $\top_k$, and $\mathrm{id}_k$ with
$(\bot_k)^{I_k} := \emptyset$, $(\top_k)^{I_k} := G_k$, and
$(\mathrm{id}_k)^{I_k} := \{(g_1, \dots, g_k) \mid g_1 = \dots = g_k \in G_0\}$
are introduced as constants. In general, the compound attributes of the power
context family $\vec{\mathbb{K}}$ are derived from the unary attributes of
$M_0 \cup M_1$, the $k$-ary attributes of $M_k$ $(2 \le k \le n)$, and the
constants by recursively applying the negation $\neg$, the permutations $\pi$,
or the joins $\iota_1$ or $\iota_2$. For suitably choosing compound attributes,
it is useful to know further operations derivable from the basic ones, for
instance (cf. [Bu91]):



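(5) $(\kappa^{i} m)^{I_{k+1}} := \{(g_1, \dots, g_{i-1}, g_i, g_i, g_{i+1}, \dots, g_k) \mid (g_1, \dots, g_k) \in m^{I_k}\}$,

(6) $(\delta^{i} m)^{I_{k-1}} := \{(g_1, \dots, g_{i-1}, g_{i+1}, \dots, g_k) \mid (g_1, \dots, g_k) \in m^{I_k}\}$,

(7) $(\xi^{i=j} m)^{I_{k-1}} := \{(g_1, \dots, g_{j-1}, g_{j+1}, \dots, g_k) \mid (g_1, \dots, g_k) \in m^{I_k} \text{ and } g_i = g_j\}$,

(8) $(\times(m, \tilde{m}))^{I_{k+l}} := \{(g_1, \dots, g_k, \tilde{g}_1, \dots, \tilde{g}_l) \mid (g_1, \dots, g_k) \in m^{I_k} \text{ and } (\tilde{g}_1, \dots, \tilde{g}_l) \in \tilde{m}^{I_l}\}$,

(9) $(\eta(m, \tilde{m}))^{I_l} := \{(g_1, \dots, g_l) \mid (g_1, \dots, g_l) \in \tilde{m}^{I_l} \text{ with } (g_1, \dots, g_k) \in m^{I_k}\}$.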

The so-called comma operation $\kappa^{i}$ “is a brilliant device that Peirce was
using with great skill and effect” ([Bu91], p. 76), the operation $\delta^{i}$
is Bernays’ “Streichung” operation, and the operations $\xi^{i=j}$ and $\times$
are called hook identity and product in [Bu91]. The insertion operation $\eta$
allows us to activate attributes of smaller arity in contexts with objects of
greater arity; for instance, $\eta(m, \top_l)$ is a compound attribute of
$\mathbb{K}_l$ with the extent
$\{(g_1, \dots, g_l) \in G_l \mid (g_1, \dots, g_k) \in m^{I_k}\}$.
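Read extensionally, the derived operations are plain manipulations of sets of tuples. The following Python sketch spells this out under stated assumptions: an attribute is represented only by its extent (a set of tuples), positions are counted from 1 as in the text, and the function names are chosen for this illustration and do not come from [Bu91].

```python
def comma(i, rel):
    """Comma operation: duplicate the i-th component of every tuple."""
    return {t[:i] + (t[i - 1],) + t[i:] for t in rel}


def deletion(i, rel):
    """Streichung: delete the i-th component of every tuple."""
    return {t[:i - 1] + t[i:] for t in rel}


def hook_identity(i, j, rel):
    """Keep tuples whose i-th and j-th components agree; drop the j-th one."""
    return {t[:j - 1] + t[j:] for t in rel if t[i - 1] == t[j - 1]}


def product(rel, other):
    """Product: concatenate every tuple of rel with every tuple of other."""
    return {a + b for a in rel for b in other}


def insertion(rel, k, other):
    """Insertion: tuples of other whose first k components form a tuple of rel."""
    return {t for t in other if t[:k] in rel}


# Hypothetical extents of the binary attributes From and To (two flights only).
FROM = {("590", "Vienna"), ("591", "Salzburg")}
TO = {("590", "Salzburg"), ("591", "Vienna")}

if __name__ == "__main__":
    # Triples (flight, departure airport, arrival airport).
    print(sorted(hook_identity(1, 3, product(FROM, TO))))
    # Activate the unary extent {("590",)} among the pairs of From.
    print(sorted(insertion({("590",)}, 1, FROM)))
```

Applying the hook identity to the product of From and To glues the two relations together along the flight number, which is the kind of compound attribute used for the information maps in Section 4.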

How the term formation by the basic operations of PAL and their derivatives
may be used to obtain a TOSCANA-system for a power context family shall
be demonstrated by our example of the domestic flights in Austria. Examples
of conceptual scales for the power context family described in Section 2 are
given by the labelled line diagrams in Figures 2, 3, and 4 in [PW99]. Figure
5 below shows a nested line diagram combining representations of two binary
relational scales. The relational scale of the outer diagram is determined by the
four binary attributes From, To, Dept, and Arrv, while the relational scale of the
inner diagram has the binary compound attributes Graz, Innsbruck, Klagenfurt,
Linz, Salzburg, Vienna, a.m., and p.m. which are formed from unary attributes
by using the insertion operation together with $\top_2$ and the conversion
$\pi_2$ (i.e. extensionally, the exchange of places 1 and 2 in an object pair).
The comparison of Figure 5 with Figure 2 makes clear the advantage of using
relational scales. Of course, further relational attributes and scales (even of
greater arity) are desirable; here we only mention the binary compound attribute
From-To := $\iota_2(\pi_2(\mathrm{From}), \mathrm{To})$, the application of which
will be shown in the next section.

Fig. 5. The concept lattice of the apposition of two binary relational scales
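The compound attribute From-To can be traced step by step with the basic operations. The Python sketch below is an illustration only: attributes are represented by their extents, the function names are assumptions, and the extents of From and To are restricted to two flights of Figure 1.

```python
def negation(rel, universe):
    """Negation: all tuples of the universe that are not in the extent."""
    return universe - rel


def permute(perm, rel):
    """Permutation: position r of the result holds component perm[r] (1-based)."""
    return {tuple(t[p - 1] for p in perm) for t in rel}


def join1(rel):
    """First join: drop the last two components where they coincide."""
    return {t[:-2] for t in rel if t[-2] == t[-1]}


def join2(rel, other):
    """Second join: glue tuples whose last and first components coincide."""
    return {a[:-1] + b[1:] for a in rel for b in other if a[-1] == b[0]}


# Hypothetical extents of From and To for the flights 590 and 591.
FROM = {("590", "Vienna"), ("591", "Salzburg")}
TO = {("590", "Salzburg"), ("591", "Vienna")}

# The conversion exchanges places 1 and 2, giving pairs (airport, flight).
CONVERTED_FROM = permute((2, 1), FROM)

# The second join then identifies the flight numbers and removes them.
FROM_TO = join2(CONVERTED_FROM, TO)

if __name__ == "__main__":
    print(sorted(FROM_TO))
```

The resulting pairs relate a departure airport directly to an arrival airport, so the flight number no longer appears in the extent of From-To.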

For using Peircean Algebraic Logic to create compound attributes and
conceptual and relational scales based on a power context family, there is not
only the argument that this logic combines well with the Contextual Logic as
developed until now, but also Burch’s thesis that “all procedures of relational
constructions are formalizable in PAL” ([Bu91], p. 122). This thesis is
analogous to Church’s thesis about computability; it is, of course, less tested,
but nevertheless convincing.

4 Graphical Representations 

For a TOSCANA-system with a conceptual scheme containing conceptual and
relational scales based on a power context family, there is still the question
of how to represent readably the concept graphs derived from the conceptual
scheme in connection with the actual database. Small concept graphs may be read
off from labelled line diagrams of the concept lattices of appositions of
relational scales, as the concept graphs about the flight connections between
two airports from the nested line diagram in Figure 5. But even slightly larger
concept graphs may seriously diminish the readability. Then, Sowa’s conventions
for drawing conceptual graphs [So84] would be better for the graphical
representation of concept graphs; for our example, such a representation is
given by Figure 6 in [PW99] showing the flight connections of a commuter between
Innsbruck and Vienna.

Fig. 6. Query graphs for retrieving flight information

Sowa’s graph representations may also be used for formulating queries within
a TOSCANA-system. This can be performed by choosing from presented line
diagrams some conceptual and relational instances and by combining those
instances to a graphically represented concept graph in which one or more
references are replaced by question marks. The graphical representation of the
query graphs is needed for checking them syntactically and semantically.
Figure 6 shows some examples of querying flight information using concept
graphs. In [GE99], it is discussed how those query graphs may be algorithmically
transformed into SQL statements for querying the database of the TOSCANA-system.
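To give an impression of what such a transformation may look like, here is a minimal Python sketch. It is not the algorithm of [GE99]: the query graph is simplified to a dictionary that maps each relation around a single flight to either a reference or a question mark, and the table `flights` with its column names is an assumption made for this illustration.

```python
import sqlite3

# Assumed mapping from the relations of a query graph to table columns.
COLUMNS = {"Flight": "flight", "From": "origin", "To": "destination",
           "Dept": "dept", "Arrv": "arrv"}


def query_graph_to_sql(graph):
    """Translate {relation: reference or '?'} into (SQL text, parameters)."""
    selected = [COLUMNS[rel] for rel, ref in graph.items() if ref == "?"]
    fixed = [(COLUMNS[rel], ref) for rel, ref in graph.items() if ref != "?"]
    if not selected:
        raise ValueError("a query graph needs at least one question mark")
    sql = "SELECT " + ", ".join(selected) + " FROM flights"
    if fixed:
        # Known references become parameterized equality conditions.
        sql += " WHERE " + " AND ".join(f"{column} = ?" for column, _ in fixed)
    return sql, [value for _, value in fixed]


def build_demo_db():
    """In-memory database with two rows taken from Figure 1."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE flights "
                 "(flight TEXT, origin TEXT, dept TEXT, destination TEXT, arrv TEXT)")
    conn.executemany("INSERT INTO flights VALUES (?, ?, ?, ?, ?)", [
        ("590", "Vienna", "10.25", "Salzburg", "11.20"),
        ("591", "Salzburg", "17.15", "Vienna", "18.10"),
    ])
    return conn


if __name__ == "__main__":
    query = {"Flight": "?", "From": "Vienna", "To": "Salzburg", "Dept": "?"}
    sql, params = query_graph_to_sql(query)
    print(sql, params)
    print(build_demo_db().execute(sql, params).fetchall())
```

Query graphs with several relational instances sharing references would need joins between tables; the sketch deliberately leaves that case out.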

Since the intention for developing TOSCANA was always to offer the user
rich information in a transparent way so that she can make her decisions based on
the presented information which is restricted as little as possible, a
TOSCANA-system should allow general queries having as answer a wide “landscape”
of detailed facts. For instance, a user, living in Vienna, might ask for the
best flight connections at the weekend between Saturday 7 a.m. and Sunday 8 p.m.
for visiting Graz, Innsbruck, and Salzburg where she has to meet colleagues to
discuss important documents. The full information, under the given restrictions,
is represented in the concept graph shown in Figure 7. Obviously, such a
representation would be too complex for the customary user. Therefore, we
propose another graphical representation, shown in Figure 8, which yields the
same flight information as the graph in Figure 7. The easily understood
conventions of the shown information map are the following: A straight line
connecting two towns indicates that there are direct flights between those towns
and an arrowhead on such a line points toward their destination; a small table
linked to an arrowhead gives the information about the relevant flights, their
departure and arrival times, and the week-days they operate. The graphical means
used can be described by PAL terms as follows: the arrows represent the objects
of the extent of From-To, the link between an arrowhead and a flight number in
the relevant table is
characterized by the compound attribute ^^^(x(F'rom, To)), and the columns 
in the small tables correspond to the objects in the extent of the compound 
attribute r]{->f,^'^{->Sa,->Su),f,^‘^{x{^^^{x{Dept,Arrv)),FlDays)). The descrip- 
tions by PAL terms allow a contextual-logic management of the information 
maps so that they can be integrated in a TOSCANA-system. 
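The data behind an information map can be assembled by grouping the flights per arrow. The Python sketch below is an illustration under simplifying assumptions: it filters only by operating days (Saturday = 6, Sunday = 7) and ignores the time window of the example query, the function name `information_map` is hypothetical, and the four rows are taken from Figure 1.

```python
from collections import defaultdict

# Rows as (flight, origin, dept, destination, arrv, operating days).
FLIGHTS = [
    ("590", "Vienna", "10.25", "Salzburg", "11.20", {1, 2, 3, 4, 5, 6, 7}),
    ("591", "Salzburg", "17.15", "Vienna", "18.10", {7}),
    ("597", "Salzburg", "19.05", "Vienna", "20.00", {1, 2, 3, 4, 5, 6, 7}),
    ("1596", "Vienna", "14.05", "Salzburg", "15.05", {5}),
]


def information_map(flights, days):
    """Group flights operating on the given days by arrow (origin, destination)."""
    arrows = defaultdict(list)
    for flight, origin, dept, destination, arrv, fl_days in flights:
        operating = sorted(fl_days & days)
        if operating:
            # One row of the small table attached to the arrowhead.
            arrows[(origin, destination)].append((flight, f"{dept}-{arrv}", operating))
    return dict(arrows)


if __name__ == "__main__":
    for arrow, table in information_map(FLIGHTS, {6, 7}).items():
        print(arrow, table)
```

Each key corresponds to an arrow of the map and each list to the small table linked to its arrowhead.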











Fig. 7. Conceptual graph representing the weekend flights between Vienna, Graz, Innsbruck, and Salzburg

Fig. 8. Information map with the same flight information as the graph in Figure 7

An inspection of the information map immediately shows that tours via Graz
to Innsbruck or Klagenfurt are impossible. Therefore, flights to Innsbruck and
Salzburg with further connections have to be considered. Since there is only one
flight from Salzburg to Graz late on Sunday, the best choices, giving reasonable
periods of time for meeting colleagues, seem to be the flights 590
(Vienna-Salzburg), 2985 (Salzburg-Innsbruck), 1583 (Innsbruck-Graz), and 549
(Graz-Vienna); but there is also another solution, namely the flights 070
(Vienna-Innsbruck), 2984 (Innsbruck-Salzburg), 597 (Salzburg-Vienna), 540
(Vienna-Graz), and 549 (Graz-Vienna). Now, the user has to decide, using further
information and preferences, for instance, about staying overnight in a hotel or
at home and about the best possible times for the meetings.

The example teaches us that the development of a TOSCANA-system has
to use as much as possible the background knowledge of the potential users,
especially for obtaining a satisfactory human-machine interface. The tremendous
increase of readability by changing from Figure 7 to 8 has its explanation in the
common knowledge in our present culture, in particular, concerning the use of
geographical maps and spatial reasoning. For instance, straight lines between
towns with arrowheads on them may be intuitively understood as flight routes,
and rows in a table, linked to a flight route, which have two numbers after
the word “Time” are identified as the corresponding departure and arrival time
without any difficulty (notice that the activated background knowledge has to
be made explicit in Figure 7). Of course, the labelled line diagrams of concept
lattices are less easily read and require some practice, but our experience is
that customers who are familiar with the application domain and are interested
in the presented information understand labelled line diagrams astonishingly
well. Therefore, also in the future, TOSCANA-systems will use labelled line
diagrams for presenting information, but will combine them with information maps
and other graphical representations.

Abstract. With the aim of building a “Semantic Web”, the content of
the documents must be explicitly represented through metadata in order
to enable contents-guided search. Our approach is to exploit a standard
language (RDF, recommended by W3C) for expressing such metadata
and to interpret these metadata in conceptual graphs (CG) in order to
exploit querying and inferencing capabilities enabled by CG formalism.
The paper presents our mapping of RDF into CG and its interest in the
context of the semantic Web.



1 Introduction 

The Web is recognized as a fabulous information repository, with millions of
heterogeneous information sources available throughout the world. But the
existing keyword-based search engines do not take into account the semantics of
the documents accessible through the Web. The user can easily be overwhelmed
by the huge number of answers (not always relevant) to a query. Therefore, the
need for a “Semantic Web” is more and more emphasized [2,3]. The semantics
of the documents must be explicitly represented through semantic metadata in
order to enable semantic-contents-guided search [1]. Several proposals have been
offered to this end, for example Ontobroker [8] and Shoe [12], that rely on
extensions of HTML and exploitation of ontologies. In [9], the authors analyse
several languages that may be used for representing metadata. They note the
following problems to be solved when dealing with large amounts of
semi-structured information: searching information, extracting information,
maintaining weakly structured sources, generating documents. They emphasize the
importance of relying on standards that are widely accepted by the Web
community. [9] stresses that knowledge representation languages offered in
artificial intelligence (AI) seem attractive for this aim, but suffer from the
lack of wide acceptance. We are convinced of the interest of AI representation
languages that not only enable the representation of metadata but also support
inferences on them. Among such AI knowledge representation formalisms, [13,14]
stress the advantages of the conceptual graph (CG) formalism for expressing
metadata. Another approach is to exploit a standard language for expressing
metadata and to be able to interpret these metadata in conceptual graphs in
order to exploit querying and inferencing capabilities enabled by the conceptual
graph formalism.

RDF (Resource Description Framework) has been introduced and recom- 
mended by the World Wide Web Consortium (W3C) to enable descriptions of 
semantic metadata for the Semantic Web. RDF enables the addition of seman- 
tic information to a Web document, without making any assumptions about the 
structure of this document. The future will reveal whether RDF will be accepted 
as a standard for content descriptions of Web resources and whether it will be 
widely used by the authors of Web documents. In the event RDF is broadly ac- 
cepted, the approach of automatically interpreting metadata expressed in RDF 
into conceptual graphs seems interesting: it will enable us both to rely on a 
standard and draw benefit from the advantages of conceptual graphs. 

The purpose of the paper is to show that conceptual graphs can be used
as a means to exploit RDF metadata to handle metadata-based search queries.
After a description of the principles of RDF, we will detail the mapping of RDF
to CG. Then, we discuss the interest of this approach in comparison to related
work.



2 RDF 

RDF is based on an underlying model with triples made of resource, property, 
and value. 

- A resource is an entity accessible by a URI on the Web (e.g. an HTML or
  XML document). Resources are the elements described by RDF statements.

- A property defines a binary relation between resources and/or atomic values.
  A property enables us to attach information to resources, and to provide
  descriptions for resources.

- A value can be either a simple character string or a resource. Reification
  enables one to transform a triple into a resource. The notion of collections
  permits us to define groups to which some properties are applied.

An RDF statement specifies a value for a property of a resource. 

RDF has an XML syntax and can be seen as an object-oriented formalism 
for metadata statements. These metadata can rely on common ontologies repre- 
sented using RDF Schema (RDFS). 

RDF statements can be considered as triples (resource, property, value). The 
vocabulary used in these triples can be defined using RDFS, by a hierarchy of 
classes and a hierarchy of properties. 
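The triple model can be pictured with a few lines of Python. This is only a sketch of the data model, not an RDF library: the class `Triple`, the field name `prop`, and the helper `describe` are names invented for this illustration, and the example statements anticipate the bookstore example of Section 3.1.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One RDF statement: a value for a property of a resource."""
    resource: str
    prop: str
    value: str


STATEMENTS = [
    Triple("http://www.bookstore.org/idl971", "author", "John Rawls"),
    Triple("http://www.bookstore.org/idl971", "title", "A Theory of Justice"),
    Triple("http://www.bookstore.org/idl971", "date", "1971"),
]


def describe(statements, resource):
    """Collect all (property, value) pairs stated about one resource."""
    return [(t.prop, t.value) for t in statements if t.resource == resource]


if __name__ == "__main__":
    print(describe(STATEMENTS, "http://www.bookstore.org/idl971"))
```

Because properties are stated independently of any class definition, further statements about the same resource can simply be appended to the list.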

Contrary to object-oriented or frame-based representations, RDF relies on a
property-centric approach. Anyone can define properties about Web resources,
in order to offer descriptions for these resources. In RDFS, properties are defined
globally and not encapsulated in class definitions. They can be specialized using
the subPropertyOf relationship. RDF/S offers three core classes:

- Resource (i.e. class of all objects),

- Property (i.e. class of all properties),

- Class (i.e. class of all classes).

Two core properties are provided: type and subClassOf. The classes can be 
specialized through the subClassOf relationship. A resource is said to be an 
instance of a given class by means of the type property. The range and domain 
core properties are used to define the range (resp. domain) of properties. 

3 Mapping RDF to CG 

The model of CG formalism [17,6] is based on (1) a support made of a concept
type lattice and of a relation type set possibly organized in a hierarchy, a set
of individual markers enabling the designation of instances, a conformity
relation between markers and types, and (2) a base of conceptual graphs built on
this support.

It therefore seems natural to translate a) the RDF statements into a base of
CG facts, b) the hierarchy of classes appearing in an RDF schema into a concept
type hierarchy in CG, and c) the hierarchy of properties appearing in an RDF
schema into a relation type hierarchy in CG. Therefore we will rely on a CG
model enabling us to build a relation type hierarchy.

3.1 Mapping of Basic RDF 

A basic RDF statement says something like: 'the author of the resource found
at http://www.bookstore.org/idl971 is John Rawls'. It can be stated as a
triple in this way:

    author (http://www.bookstore.org/idl971, 'John Rawls')

Several statements can be written about the same resource, for example:

    title (http://www.bookstore.org/idl971, 'A Theory of Justice')
    date (http://www.bookstore.org/idl971, '1971')

Written with the RDF/XML syntax:

    <rdf:Description about='http://www.bookstore.org/idl971'>
      <author>John Rawls</author>
      <title>A theory of Justice</title>
      <date>1971</date>
    </rdf:Description>

This can be interpreted in CG as:

    [Resource : http://www.bookstore.org/idl971] - {
      -> (author) -> [Literal : John Rawls]
      -> (title) -> [Literal : A theory of Justice]
      -> (date) -> [Literal : 1971]}

The principle of the mapping relies on considering an RDF description as
an instance of a Resource CG concept type and the associated properties as
relations of this concept. The designator of a Resource concept is the URI of the
resource itself.
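The principle just stated can be sketched as a small Python function that prints a description in the linear CG notation used above. The function name and its parameters are assumptions for this illustration; the sketch handles literal values only and is not the authors' implementation.

```python
def to_conceptual_graph(resource_uri, properties, concept_type="Resource"):
    """Render one RDF description as a CG in linear notation.

    properties is a list of (property name, literal value) pairs; the URI
    of the resource becomes the designator of the head concept.
    """
    lines = [f"[{concept_type} : {resource_uri}] - {{"]
    for prop, literal in properties:
        # Every property becomes a relation pointing to a Literal concept.
        lines.append(f"  -> ({prop}) -> [Literal : {literal}]")
    lines[-1] += "}"
    return "\n".join(lines)


if __name__ == "__main__":
    print(to_conceptual_graph(
        "http://www.bookstore.org/idl971",
        [("author", "John Rawls"),
         ("title", "A theory of Justice"),
         ("date", "1971")],
    ))
```

The optional `concept_type` argument leaves room for typed descriptions, where a more specific concept type than Resource is used.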

3.2 Mapping of Nested RDF Descriptions 

The value of a basic RDF triple can also be an RDF description. For example, we
can express that the value of the bio property is itself a description of another
resource:

    <rdf:Description about='http://www.bookstore.org/idl971'>
      <title>A theory of Justice</title>
      <author>John Rawls</author>
      <bio>
        <rdf:Description about='http://www.bookstore.org/John.Rawls'>
          <position>Philosopher</position>
        </rdf:Description>
      </bio>
    </rdf:Description>

This RDF description will be translated into the following CG:

    [Resource : http://www.bookstore.org/idl971] - {
      -> (title) -> [Literal : A theory of Justice]
      -> (author) -> [Literal : John Rawls]
      -> (bio) -> [Resource : http://www.bookstore.org/John.Rawls] ->
           (position) -> [Literal : Philosopher]}

In the case of nested RDF descriptions, the mapping consists of creating a
Resource-typed concept for each nested resource description. Each nested
resource concept is then linked to its embedding resource via a relation. For
example, in the previous example, the nested resource
http://www.bookstore.org/John.Rawls is linked by means of a bio relation to the
embedding resource http://www.bookstore.org/idl971.
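The recursive nature of this mapping can be sketched with Python's standard `xml.etree.ElementTree` module. The sketch is an illustration, not the authors' implementation: the nested tuple structure it returns and the function name `description_to_cg` are assumptions, and the `xmlns:rdf` declaration is added to the example so that the fragment parses on its own.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DESCRIPTION = f"{{{RDF_NS}}}Description"

EXAMPLE = """
<rdf:Description xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
                 about="http://www.bookstore.org/idl971">
  <title>A theory of Justice</title>
  <author>John Rawls</author>
  <bio>
    <rdf:Description about="http://www.bookstore.org/John.Rawls">
      <position>Philosopher</position>
    </rdf:Description>
  </bio>
</rdf:Description>
"""


def description_to_cg(description):
    """Map one rdf:Description element to (concept, list of (relation, target))."""
    concept = ("Resource", description.get("about"))
    relations = []
    for prop in description:
        nested = prop.find(DESCRIPTION)
        if nested is not None:
            # A nested description yields its own Resource concept.
            target = description_to_cg(nested)
        else:
            target = (("Literal", (prop.text or "").strip()), [])
        relations.append((prop.tag, target))
    return concept, relations


if __name__ == "__main__":
    print(description_to_cg(ET.fromstring(EXAMPLE.strip())))
```

The relation named after the embedding property (here bio) is what links the nested Resource concept to the embedding one.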

3.3 RDF Schema 

RDF descriptions can be typed according to a predefined ontology called an RDF
schema. Thus the RDFS formalism provides a vocabulary for the RDF annotations.
For example, the previous description can be typed as a description of a book,
the class Book being itself defined in an RDF schema.

    <rdf:RDF xmlns:rdf='http://www.w3.org/1999/02/22-rdf-syntax-ns#'
             xmlns:ns='http://www.inria.fr/acacia/iccs#'>
      <ns:Book rdf:about='http://www.bookstore.org/idl971'>
        <ns:author>John Rawls</ns:author>
        <ns:title>A theory of Justice</ns:title>
      </ns:Book>
    </rdf:RDF>

The leading RDF markup with a xmlns : rdf attribute defines the RDF 
namespace rdf as a shortcut for the RDF URI, namely : http://www.w3.org/- 
1999/02/22-rdf-syntax-ns#. Each RDF markup must be prefixed with this 
namespace in order to identify what is RDF-related and what is application- 
specific. An XML namespace is also associated with the application schema, 
(say ns), in order to identify the markup correctly. 

The corresponding CG is the following: 

[Book : http://www.bookstore.org/id1971] - {
    -> (author) -> [Literal : John Rawls]
    -> (title) -> [Literal : A theory of Justice]}

Accordingly, when the RDF description is typed by a specific RDF Schema 
class, a concept of the corresponding type is created. In the example, the Re- 
source is of type Book, hence a Book concept is created. 

Concept and relation type names must be prefixed in a unique way with a
URI, to prevent name clashes between different schemas. For example, the
concept type Book should be named http://www.inria.fr/acacia/iccs#Book and
the relation type author http://www.inria.fr/acacia/iccs#author. For
the sake of readability, we skip these prefixes for CGs in the paper, but they
are mandatory in the implementation.



4 Mapping RDF Schema 

The RDF Schema, according to the current W3C Candidate Recommendation 
from March 2000 [5], allows the definition of classes and properties. Classes and 
properties can be refined in subclasses and subproperties. In RDF, properties 
are first class objects which exist by themselves. Hence, new properties can be 
added to existing classes in order to enable reuse of these classes. 

4.1 Mapping of Classes 

RDFS classes can be modelled as CG concept types. In order to map the Re- 
source core class of RDFS, we introduce a Resource concept type at the top level 
of the CG concept type hierarchy. 

concept type Resource 

An RDFS class without any explicitly indicated superclass will be modelled
as a direct subtype of Resource in the CG concept type hierarchy.

<rdfs:Class rdf:ID='C1'/>

This can be translated into:

concept type C1 < Resource

For example:

<rdfs:Class rdf:ID='Book'/>

can be modelled as:

concept type Book < Resource

4.2 Mapping Subclasses 

The subClassOf relation between classes in RDFS corresponds to the subtype
relation (denoted ≤) between concept types in the CG formalism. If an RDFS
class C12 is defined as a subclass of C11, it will be modelled by a C12 concept
type, subtype of the C11 concept type in the CG concept type hierarchy.

<rdfs:Class rdf:ID='C12'>
  <rdfs:subClassOf rdf:resource='#C11'/>
</rdfs:Class>

This can be translated into:

concept type C12 < C11

The Novel subclass of Book:

<rdfs:Class rdf:ID='Novel'>
  <rdfs:subClassOf rdf:resource='#Book'/>
</rdfs:Class>

can be modelled as a subtype of Book:

concept type Novel < Book
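
The class and subclass mappings of Sections 4.1 and 4.2 can be sketched as a small type hierarchy in Python. The class `ConceptTypeHierarchy` and its methods are hypothetical names introduced for illustration, and the sketch assumes single inheritance for simplicity; it is not the data structure used by the prototype.

```python
class ConceptTypeHierarchy:
    """Concept types with single inheritance (a simplifying assumption)."""

    def __init__(self, top="Resource"):
        self.top = top
        self.supertype = {top: None}

    def add_class(self, name, sub_class_of=None):
        """Map an RDFS class; without a superclass it goes under the top type."""
        parent = sub_class_of if sub_class_of is not None else self.top
        if parent not in self.supertype:
            raise KeyError(f"unknown supertype: {parent}")
        self.supertype[name] = parent

    def is_subtype(self, sub, sup):
        """True if sub equals sup or lies below it in the hierarchy."""
        current = sub
        while current is not None:
            if current == sup:
                return True
            current = self.supertype[current]
        return False


types = ConceptTypeHierarchy()
types.add_class("Book")                        # concept type Book < Resource
types.add_class("Novel", sub_class_of="Book")  # concept type Novel < Book
print(types.is_subtype("Novel", "Resource"))
```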

4.3 Mapping of Properties 

A property is defined according to a domain (i.e. a class) and has an associated 
range that can be a literal or a class. For example, the title property can be 
defined with ’Book’ domain and ’Literal’ range. ’Literal’ is prefixed with the 
RDFS namespace. 

<rdf:Property ID='title'>
  <rdfs:domain rdf:resource='#Book'/>
  <rdfs:range rdf:resource='http://www.w3.org/TR/1999/PR-rdf-schema-19990303#Literal'/>
</rdf:Property>

A property definition can be modelled as a CG binary relation type with 
an associated signature that maps the domain and range to the related concept 
types. For instance, the previous example is translated into: 

relation type title (Book, Literal) 

4.4 Mapping of SubProperties

In RDFS, a subproperty can refine an existing property, in the same way as a
subclass refines a class. This means that if p2 is a subproperty of p1, then:

p2(URI, v) => p1(URI, v)

that is, if a URI has value v for property p2, then it also has v as value for
property p1.

For example, we define here the author property and its jointAuthor sub- 
property : 

<rdf:Property ID='author'>
  <rdfs:domain rdf:resource='#Book'/>
  <rdfs:range rdf:resource='http://www.w3.org/TR/1999/PR-rdf-schema-19990303#Literal'/>
</rdf:Property>

<rdf:Property ID='jointAuthor'>
  <rdfs:subPropertyOf rdf:resource='#author'/>
</rdf:Property>

In terms of CGs, an RDF subproperty is translated into a relation type that 
is a subtype of the relation type translating the superproperty. In the example 
above, the jointAuthor relation type will be defined as a subtype of the author 
relation type. 

relation type author (Book, Literal) 
relation type jointAuthor < author 
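
The implication p2(URI, v) => p1(URI, v) can be illustrated with a minimal Python sketch. The triple representation, the `sub_property_of` dictionary and both function names are assumptions of this example (the hierarchy is assumed to be acyclic); the prototype relies on the CG relation type hierarchy instead.

```python
def with_superproperties(prop, sub_property_of):
    """Return prop followed by all of its superproperties."""
    chain = []
    while prop is not None:
        chain.append(prop)
        prop = sub_property_of.get(prop)
    return chain


def holds(triples, sub_property_of, uri, prop, value):
    """True if uri has value for prop, directly or through a subproperty."""
    return any(
        s == uri and v == value and prop in with_superproperties(p, sub_property_of)
        for (s, p, v) in triples
    )


sub_property_of = {"jointAuthor": "author"}   # jointAuthor < author
triples = [("http://example.org/book", "jointAuthor", "Jane Doe")]
print(holds(triples, sub_property_of, "http://example.org/book", "author", "Jane Doe"))
```

A statement made with jointAuthor therefore also answers a question about author, but not the other way round.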



4.5 Remark: Limits of this Mapping 

Let us note a limitation of the mapping from RDF to CG:

In the RDF Schema, a property can be defined with several classes as domain.
However, this is not directly possible in CG, because a relation type has only
one signature. If this happens, there are three possible solutions:

1. If the domain classes have a common RDF Schema class ancestor, set the
relation domain to that ancestor,

2. or else, define an abstract common superclass of all domain classes of the
property and assign this new class as the domain of the relation signature.
This is only possible if one controls the RDF Schema(s).

3. otherwise, set the domain of the property to Resource, the top level CG
concept type that matches all resources. The property will then be allowed
for all resources.

For example:

<rdf:Property ID='title'>
  <rdfs:domain rdf:resource='#Book'/>
  <rdfs:domain rdf:resource='#Show'/>
  <rdfs:range rdf:resource='http://www.w3.org/TR/1999/PR-rdf-schema-19990303#Literal'/>
</rdf:Property>

Solution 2 would consist of defining an abstract class, called ’Work’ for
example, having ’Book’ and ’Show’ as subclasses, and then defining the title
relation type with the following signature:

concept type Work 
concept type Book < Work 
concept type Show < Work 

relation type title (Work, Literal) 

Otherwise, solution 3 would consist of defining title as : 

relation type title (Resource, Literal) 
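
Solutions 1 and 3 can be combined in a short Python sketch that picks the most specific common ancestor of the domain classes and falls back to Resource. The function names and the dictionary-based hierarchy (single inheritance) are assumptions made for illustration only.

```python
def ancestors(concept_type, supertype):
    """Return the type followed by its supertypes, most specific first."""
    chain = []
    while concept_type is not None:
        chain.append(concept_type)
        concept_type = supertype.get(concept_type)
    return chain


def common_domain(domains, supertype, top="Resource"):
    """Pick a single signature domain for a property with several domains."""
    first, *rest = domains
    for candidate in ancestors(first, supertype):      # most specific first
        if all(candidate in ancestors(d, supertype) for d in rest):
            return candidate                            # solution 1
    return top                                          # solution 3


with_work = {"Resource": None, "Work": "Resource", "Book": "Work", "Show": "Work"}
flat = {"Resource": None, "Book": "Resource", "Show": "Resource"}
print(common_domain(["Book", "Show"], with_work))
print(common_domain(["Book", "Show"], flat))
```

With the Work superclass of solution 2 in place, the first call yields the signature title (Work, Literal); without it, the second call degrades to title (Resource, Literal).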



4.6 Implementing the RDF Schema Metamodel 

The RDF metamodel is itself described in RDF, thus enabling the extension of 
the metamodel, by refining the predefined classes and properties. 

For example, in RDF Schema, it is possible to define a ’Concept’ metaclass 
that refines the ’Class’ metaclass, and then define a schema in terms of ’Concept’ 
instead of ’Class’ : 

<rdfs:Class rdf:ID="Concept">
  <rdfs:subClassOf rdf:resource="http://www.w3.org/TR/1999/PR-rdf-schema-19990303#Class"/>
</rdfs:Class>

<ns:Concept rdf:ID="Paper"/>

We have implemented the whole RDF and RDF Schema metamodel in the
CG support and enabled metamodel extension.



5 More Advanced Features 

5.1 Reification 

A reified statement is an RDF statement about an RDF statement, for example:

John K. Galbraith says:

’The author of the resource http://www.bookstore.org/id1971 is
John Rawls’.

In RDF, the treatment of such a reified statement is somewhat cumbersome
and introduces specialized properties, as shown below:

<rdf:Description>
  <rdf:subject resource="http://www.bookstore.org/id1971"/>
  <rdf:predicate resource="#author"/>
  <rdf:object>John Rawls</rdf:object>
  <rdf:type resource="http://www.w3.org/1999/02/22-rdf-syntax-ns#Statement"/>
  <attributedTo>John K. Galbraith</attributedTo>
</rdf:Description>

In CG, this can be done simply with a context named Statement : 

[Statement: [Book: http://www.bookstore.org/id1971] -> (author) ->
              [Literal : John Rawls]
] -> (attributedTo) -> [Literal : John K. Galbraith]

Reified RDF statements, i.e. descriptions that are instances of the Statement
class, are translated into CG by means of a predefined context named Statement.
The RDF description itself is translated into CG following the standard mapping.

5.2 Mapping of Containers 

The model of basic RDF relies on triples with single values. In the case where
the value of a property is a set of values, the W3C recommendation defines
containers such as bags, sequences and alternatives in order to hold such values.
For example, the authors are given here as a set of names:

<rdf:Paper rdf:about='http://www.inria.fr/acacia/iccs2000'>
  <ns:authors>
    <rdf:Bag>
      <rdf:li>Cédric Hébert</rdf:li>
      <rdf:li>Olivier Corby</rdf:li>
      <rdf:li>Rose Dieng</rdf:li>
    </rdf:Bag>
  </ns:authors>
</rdf:Paper>

Containers can be handled by an adequate Bag concept which is related to
its members by means of a member relation, called rdf:li:

[Paper : http://www.inria.fr/acacia/iccs2000]
    -> (authors) -> [Bag] - {
        -> (rdf:li) -> [Literal : Cédric Hébert]
        -> (rdf:li) -> [Literal : Olivier Corby]
        -> (rdf:li) -> [Literal : Rose Dieng]}

We introduce in the CG concept type hierarchy an abstract ’Container’ concept
type and three subtypes for the concrete containers (see the appendix on
types at the end of the paper). We also introduce an rdf:li relation type:

relation type rdf:li (Container, Resource)

5.3 Mapping Containers Having aboutEach Statements 

RDF offers a means to factorize statements that apply to all members of a 
container. This is done with an aboutEach statement. In the example below, the 
bag is given the auth ID, and it is said that all members of the auth bag have 
INRIA as institute : 

<rdf:Paper rdf:about='http://www.inria.fr/acacia/iccs2000'>
  <ns:authors>
    <rdf:Bag ID='auth'>
      <rdf:li>Cédric Hébert</rdf:li>
      <rdf:li>Olivier Corby</rdf:li>
      <rdf:li>Rose Dieng</rdf:li>
    </rdf:Bag>
  </ns:authors>
</rdf:Paper>

<rdf:Description aboutEach='#auth'>
  <ns:inst>INRIA</ns:inst>
</rdf:Description>

We can translate on the fly to distribute the factorized value to all bag 
members : 

[Paper : http://www.inria.fr/acacia/iccs2000]
    -> (authors) -> [Bag] - {
        -> (rdf:li) -> [Literal : Cédric Hébert]
            -> (inst) -> [Literal : INRIA]
        -> (rdf:li) -> [Literal : Olivier Corby]
            -> (inst) -> [Literal : INRIA]
        -> (rdf:li) -> [Literal : Rose Dieng]
            -> (inst) -> [Literal : INRIA] }
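
Distributing an aboutEach statement over the members of a bag can be sketched as follows. The function `distribute_about_each` and the list-based representation are hypothetical; they only illustrate the idea and are not the prototype's translator.

```python
def distribute_about_each(members, shared_properties):
    """Attach a copy of the shared (property, value) pairs to every member."""
    return [(member, list(shared_properties)) for member in members]


authors = ["Olivier Corby", "Rose Dieng"]
graph = distribute_about_each(authors, [("inst", "INRIA")])

# Print the bag members in a CG-like linear form.
for member, properties in graph:
    print(f"-> (rdf:li) -> [Literal : {member}]")
    for name, value in properties:
        print(f"    -> ({name}) -> [Literal : {value}]")
```

Each member receives its own copy of the factorized properties, so a later change to one member does not affect the others.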



6 Querying 

The main interest of mapping RDF to CG is the close fit between the two
models: concepts and relations map smoothly onto classes and properties,
which are defined independently in CG as well as in RDF. Furthermore, the
mapping lets authors use RDF without any knowledge of Conceptual Graphs.

The second reason is the relevance of the CG projection operation to querying
an RDF/CG base. Querying RDF metadata consists of retrieving RDF triples
belonging to classes, taking specialization into account. This can be done
through the projection operation.

Furthermore, thanks to the implementation platform that we have chosen, 
namely F. Southey’s Notio [16], it is possible to parametrize precisely the graph 
matching process. Hence, it is possible to tune concept matching, including type 
and instance matching. For example, concepts may match according to (at least) 
one of the four conditions on concept types : 

— first type is a supertype of the second 

— first type is a subtype of the second 

— first type is either a subtype or a supertype of the second 

— concepts have same type. 

Relation matching can also be parametrized, as well as other aspects of the
graph matching. This functionality is well suited to metadata information
retrieval, as it allows approximate matching along specialization and
generalization, on both relations and concepts.
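
The four conditions on concept types can be written down as a small Python sketch. This is not Notio's API: the function names, the mode strings and the dictionary-based hierarchy are assumptions of this example, and the subtype test is taken to be reflexive.

```python
def is_subtype(sub, sup, supertype):
    """True if sub equals sup or lies below it (reflexive by assumption)."""
    while sub is not None:
        if sub == sup:
            return True
        sub = supertype.get(sub)
    return False


def types_match(first, second, supertype, mode):
    """Compare two concept types under one of the four matching conditions."""
    if mode == "supertype":      # first type is a supertype of the second
        return is_subtype(second, first, supertype)
    if mode == "subtype":        # first type is a subtype of the second
        return is_subtype(first, second, supertype)
    if mode == "either":         # subtype or supertype
        return (is_subtype(first, second, supertype)
                or is_subtype(second, first, supertype))
    if mode == "same":           # concepts have the same type
        return first == second
    raise ValueError(f"unknown mode: {mode}")


supertype = {"Resource": None, "Book": "Resource", "Novel": "Book", "Show": "Resource"}
print(types_match("Book", "Novel", supertype, "supertype"))
```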

In the current prototype, the query language is RDF itself. The user describes
a partial RDF statement that he is looking for. The RDF query may hold
variables, prefixed by ’?’ to indicate the parts that are unknown, the value
of which should be returned by the query processor.

For example, let us look for books whose author is John Rawls, and
return their title:

<ns:Book rdf:about="?1">
  <ns:author>John Rawls</ns:author>
  <ns:title>?2</ns:title>
</ns:Book>

The RDF query is translated into the graph shown below : 

[Book] - { 

-> (author) -> [Literal : John Rawls] 

-> (title) -> [Literal]} 

The query processor projects the query graph on the CG base. The resulting
(sub)graphs are translated back into RDF in order to be presented to the user
in a uniform way.
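
A much simplified view of this query step is sketched below in Python. It is an illustration under strong assumptions, not the Notio projection: fact graphs are flat (one typed resource with single-valued literal properties), relation subtypes are ignored, and all names and example.org URIs are hypothetical.

```python
def is_subtype(sub, sup, supertype):
    """True if sub equals sup or lies below it in the type hierarchy."""
    while sub is not None:
        if sub == sup:
            return True
        sub = supertype.get(sub)
    return False


def project(query, facts, supertype):
    """Return (uri, bindings) for every fact graph the query projects onto."""
    query_type, query_props = query
    answers = []
    for fact_type, uri, fact_props in facts:
        if not is_subtype(fact_type, query_type, supertype):
            continue                       # the fact must specialize the query type
        bindings = {}
        for name, wanted in query_props.items():
            actual = fact_props.get(name)
            if actual is None:
                break                      # relation missing in the fact
            if wanted.startswith("?"):
                bindings[wanted] = actual  # variable: record the value found
            elif wanted != actual:
                break                      # literal mismatch
        else:
            answers.append((uri, bindings))
    return answers


supertype = {"Resource": None, "Book": "Resource", "Novel": "Book"}
facts = [
    ("Book", "http://example.org/b1",
     {"author": "John Rawls", "title": "A theory of Justice"}),
    ("Novel", "http://example.org/b2",
     {"author": "Jane Doe", "title": "Some Novel"}),
]
query = ("Book", {"author": "John Rawls", "title": "?2"})
print(project(query, facts, supertype))
```

Because the type test uses the hierarchy, a query for Book also reaches facts typed Novel, which is the specialization behaviour described above.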

The prototype also implements approximate search on literal values, thanks
to the Notio matching scheme that enables us to attach customized match
comparators to markers. We implemented such an approximate comparator, which
tests whether the query literal value is included in the graph literal value.
Approximate query values are prefixed by the ’~’ character.

It is then possible to send a query that searches for an author whose value
contains ’Rawls’, as shown below.

<ns:Book rdf:about="?1">
  <ns:author>~Rawls</ns:author>
  <ns:title>?2</ns:title>
</ns:Book>
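
The behaviour of such a comparator can be sketched in Python. The function `literal_matches` is a hypothetical stand-in written for this illustration; it does not reproduce the Notio comparator interface.

```python
def literal_matches(query_value, graph_value):
    """Compare a query literal with a literal found in the CG base."""
    if query_value.startswith("?"):
        return True                            # variable: matches any value
    if query_value.startswith("~"):
        return query_value[1:] in graph_value  # approximate: substring test
    return query_value == graph_value          # exact match otherwise


print(literal_matches("~Rawls", "John Rawls"))
print(literal_matches("Rawls", "John Rawls"))
```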

If several properties implement the author relationship (e.g. author, joint
authors, etc.), the problem may arise of taking all of them into account
during query processing. In fact, RDF enables the association of a common
external label, say author, with all these properties. An advanced query GUI
would be able to propose the abstract author property to the user and
translate it into several CGs according to the signatures of the target author
relations (container, literal, etc.). In any case, this problem pertains to
RDF, not to the CG translation model.



7 Implementation 

We implemented a prototype using the Notio CG platform [16] and the VRP
RDF parser from ICS-FORTH [19]. We exploited the possibility offered by Notio
for parametrizing the projection (cf. generalization, specialization). Our
prototype can translate classes and properties of RDF Schema and RDF
statements, except the aboutEachPrefix RDF statement, which is not presented
here.

The translation of the RDF metadata into a base of CG-facts can be done
automatically thanks to our prototype. Several ways of integrating such an
automatic RDF→CG translator in a Web search engine can be envisaged:

- A robot could access the Web documents and build a base of CG-facts
corresponding to the translation into CG of their RDF metadata. The link
between the CG-facts associated with a document and the document would
be kept by the robot. This would be done before any request from a user.
Then, when a user makes a request to search for a given document, this request
would be translated into a CG-query. The parameters of the user’s request
(in particular, whether the user wants to obtain approximate answers by
enabling generalization or specialization) can also be exploited to
parametrize the projection to be used. The results of the projection of the
CG-query on the base of CG-facts constitute the answers to the user’s request.

- Another possibility would be to let the search engine use the translator to
build the CG dynamically, only after a request by the user.

The prototype currently runs as a Java servlet, accessed by means of a
standard Web browser at a given URL. The user can type a query and send it
to the system, which performs the projection and sends back the answer in RDF.
The resulting RDF statements are displayed by means of an XSLT stylesheet,
thanks to James Clark’s XT engine [7]. The URIs contained in the resulting RDF
statements are transformed into active HTML links on which the user may click.
Hence, the whole process implements a conceptual search engine.

8 Conclusion and Discussion

Our approach delivers a more powerful and relevant search with the CG projec- 
tion. In particular, the parametrization of the projection enables several levels 
of search. 

Our approach takes advantage of the CG formalism, as do [13,14], but without
requiring the author of the document to know CG. The interest of our approach
is that if RDF, recommended by W3C, is widely adopted as a standard by the
Web community, then a Web document author can both continue to use RDF
annotations and benefit from the CG formalism, without having to know that
formalism.

The exploitation of RDF schemas by means of Conceptual Graphs seems 
more relevant in the context of a company or of a given community: this company 
or community can agree on the conceptual vocabulary used for expressing the 
metadata about their documents. 

In the future, we plan to study a query language for RDF statements and 
the mapping to appropriate CG projections. 

Appendix: The Type System

This appendix presents (part of) the type system as it is implemented in the 
prototype. We also present (part of) the metamodel as it is implemented. These 
meta types are not intended to be instantiated in conceptual graphs, but only 
to be specialized in meta schema. 

rdf = 'http://www.w3.org/1999/02/22-rdf-syntax-ns#'
rdfs = 'http://www.w3.org/TR/1999/PR-rdf-schema-19990303#'

concept type rdf:Thing
concept type rdfs:Literal < rdf:Thing
concept type rdfs:Resource < rdf:Thing
concept type rdfs:Container < rdfs:Resource
concept type rdf:Bag < rdfs:Container
concept type rdf:Alt < rdfs:Container
concept type rdf:Seq < rdfs:Container
concept type rdf:Statement < rdfs:Resource

relation type rdf:Property (rdfs:Resource, rdf:Thing)
relation type rdf:li (rdfs:Container, rdf:Thing) < rdf:Property

concept type rdfs:Class < rdfs:Resource
concept type rdf:Property < rdfs:Resource

rel.type rdf:type (rdfs:Resource, rdfs:Class) < rdf:Property
rel.type rdfs:subClassOf (rdfs:Class, rdfs:Class) < rdf:Property
rel.type rdfs:subPropertyOf (rdf:Property, rdf:Property) < rdf:Property
rel.type rdfs:ConstProperty (rdfs:Resource, rdf:Thing) < rdf:Property
rel.type rdfs:domain (rdf:Property, rdfs:Class) < rdfs:ConstProperty
rel.type rdfs:range (rdf:Property, rdf:Thing) < rdfs:ConstProperty
rel.type rdf:object (rdf:Statement, rdf:Thing) < rdf:Property
rel.type rdf:subject (rdf:Statement, rdfs:Resource) < rdf:Property
rel.type rdf:predicate (rdf:Statement, rdf:Property) < rdf:Property


Abstract. We give an overview of the computational tools for conceptual
structures that have emerged from the theory of Formal Concept
Analysis, with emphasis on basic ideas rather than technical details. We
describe what we mean by conceptual computations, and try to convince
the reader that an elaborate formalization is a necessary precondition.
Claiming that Formal Concept Analysis provides such a formal background,
we present as examples two well-known algorithms in very simple
pseudo code. These can be used for navigating in a lattice, thereby
supporting some prototypical tasks of conceptual computation. We refer
to some of the many more advanced methods, discuss how to compute
with limited precision, and explain why, in the case of incomplete
knowledge, the conceptual approach is more efficient than a combinatorial
one. Utilizing this efficiency requires skillful use of the formalism. We
present two results that lead in this direction.



1 Introduction: Conceptual vs. Computational 

The basic meaning of “to compute” is to perform arithmetic operations on
numbers in order to obtain a numerical result. To solve a difficult problem,
highly complex computations may be necessary, but fortunately, computations
can be done efficiently and with high precision. They lead to reliable results,
at least when they are carried out by a computer, i.e., by computing machinery.
Human beings tend to count computations among their less desirable activities.
As units of (human) thought we prefer concepts over numbers, grouping objects
according to their common attributes. Typical operations of conceptual
knowledge processing are generalization and subsumption, judgement and
conclusion, rather than addition and multiplication.

Numerical computations are useful for certain purposes, in particular for
describing the physical world in terms of counting and measuring. When they are
appropriate, their power often goes far beyond that of merely verbal
argumentation. It is a common belief that computations are objective, and that
conceptual conclusions are not. Most objects of our thought, however, cannot
naturally be expressed by numbers, and therefore numerical computations are of
very limited use. A good indication that it is not straightforward to use
numerical methods for conceptual purposes is the example of computers
themselves: their computational power and accuracy stand in sharp contrast to
the incomprehensibility and untrustworthiness of their operating instructions
and manuals.

There were many attempts to combine the advantages of computation with
those of other knowledge processing techniques, from Aristotelian logic to Logic
Programming. Logical calculi and algorithms were invented that support
automated reasoning, but also numerical descriptions of concepts, based on
similarity and dimensionality, are widely used. Yet these approaches have not
led to a satisfactory “mathematics of concepts”. In fact, there are warnings
that such attempts may be in vain:

Mathematicians wish to treat matters of perception mathematically, and 
make themselves ridiculous . . . the mind . . . does it tacitly, naturally, and 
without technical rules. 

Blaise Pascal (1670), cited from [3]. 

In fact, mental processes cannot be reduced to simple operations similar to the
operations of arithmetic. The power of modern computing machinery has not
been sufficient to fully emulate the mental abilities even of a snail or a butterfly.
We have learnt to be modest when exploring the “technical rules” of the brain.

Nevertheless, there has been some progress since B. Pascal. We have a bet- 
ter understanding of the principles of mental operations, and we can formulate 
methods that perform tasks similar to mental information processing. Modern 
mathematical logic offers sophisticated and efficient algorithms. But its notions 
are too far from everyday language. There certainly still is demand for con- 
ceptual computation. John Sowa has proposed to use Conceptual Graphs as “a 
system of logic based on the existential graphs of Charles Sanders Peirce and the 
semantic networks of artificial intelligence. They express meaning in a form that 
is logically precise, humanly readable, and computationally tractable”. To work 
out the mathematical foundations that are necessary to make such computa- 
tions work is, however, quite a task. It is advisable to look for ties to existing 
mathematical theories that could be utilized. 

About a century ago, mathematicians became aware that computation is not
restricted to numbers: computing with functions, matrices, polynomials,
symmetries and the like had become a matter of course. A field of mathematical
research devoted to studying the meta rules of computation was established in
the 1930s under the name of Universal Algebra (Birkhoff [1], see also Grätzer
[11]). It provides a vast repertoire of methods. A general algebraic structure in
the sense of Universal Algebra is made up from arbitrary elements (replacing
numbers as units of computation) and of operations (that take these elements
as input and produce as output other elements, only depending on the input).
In the setting of conceptual knowledge processing, the elements may be the
concepts and the operations can be, as indicated above, common generalization
and specialization, but possibly also others like negation or opposition. An
algebra of concepts along these lines was worked out by Rudolf Wille and his
group under the name of Formal Concept Analysis [9]. Based on mathematical
Lattice Theory, it has meanwhile become a rather compact and stable theory
itself. It can, as Wille [21] has demonstrated, be linked with Conceptual
Graphs, thereby leading to a multitude of computational possibilities for these.

2 Formalization Precedes Computation

Do we need a mathematical theory in order to compute? Children can do with- 
out, can’t they? Above, we have mentioned some natural operations for compu- 
tations with concepts, such as common generalization and specialization. One 
could be tempted to use these operations right away and find out later, by doing, 
about the rules that govern such computations. 

We predict that such an approach will not be successful. Consider as a simple 
example the sentence 

The least common superconcept of cat and dog is pet. 

This seems to be an instance of a conceptual operation as described earlier.
The inputs are the concepts cat and dog, the output is the concept pet. But
such a statement cannot remain undisputed, and therefore it does not have the
necessary objectivity. It is crucial for an algebraic operation that its
results are independent of the computational process. As soon as there is the
slightest doubt, computations lose their value.

We might argue that the cat-dog statement is false (because e.g. carnivore
denotes a superconcept of cat and dog, but not of pet, and because tigers are
cats, but not pets). That would even be too optimistic, because it suggests that
there might be a correct result. But this is not the case. There is no objective
least superconcept of cat and dog; moreover, there is no objective interpretation
of cat and dog either¹. The meanings of these words depend on context, and so
do conceptual operations.

This is a serious obstacle for computation. We are forced to consider
context when computing, and, in particular, must precisely disclose the
contextual influence in a manner that can be algorithmically encoded. An
indispensable precondition for successful conceptual computation therefore is
a suitable formalization of context and context-dependent concept operations.

One approach to do this is to use so-called conceptual ontologies and type
hierarchies. A much more general and mathematically simpler path is taken in
Formal Concept Analysis. There, a formal context is mathematically defined to
be a triple $(G, M, I)$ consisting of two sets $G$, $M$ and a binary relation
$I \subseteq G \times M$. The elements of $G$ and $M$ are usually interpreted
as objects and attributes, with the relation $I$ expressing the incidence
between them, but note that the sets $G$ and $M$ may be arbitrary. All
conceptual operations are performed with reference to such formal contexts. It
may seem that such a data type is too simple, and in fact, more complex context
representations are also used in Formal Concept Analysis. But it can be
demonstrated that all these can be expressed in and represented by one or
several such formal contexts $(G, M, I)$, by means of a translation process
called scaling.

¹ This seems to be just splitting hairs. A cat is a cat, isn’t it? Let me reformulate: It
is not obvious at all how to make the notion of the set of all cats sufficiently precise.
For example, does it depend on time? We can easily agree on one definition, but
the word “cat” by itself does not give a precise definition. In order to apply formal
methods effectively, however, this level of precision is indispensable.







Bernhard Ganter 



It was again Rudolf Wille [21] who demonstrated that this machinery can
also be used to introduce a mathematical representation of conceptual graphs
which connects these with the broad algebraic theory behind Formal Concept
Analysis. This provides many possibilities for new algorithms. Such results will
be presented elsewhere; here we shall give some impressions of the computational
methods used by Formal Concept Analysis itself.

3 Formal Concepts 

The basic mathematical theory of Formal Concept Analysis has been presented
on many occasions; see [9] for a monograph. Let us briefly recall that for any
given formal context $(G, M, I)$ we can define two operators that map subsets of
$G$ to subsets of $M$ and conversely. For $A \subseteq G$ and $B \subseteq M$
they are given by

$$A \mapsto \operatorname{int}(A) := \{m \in M \mid g\,I\,m \text{ for all } g \in A\},$$

$$B \mapsto \operatorname{ext}(B) := \{g \in G \mid g\,I\,m \text{ for all } m \in B\},$$

and that a formal concept of $(G, M, I)$ is defined as a pair $(A, B)$ of sets
satisfying $A \subseteq G$, $B \subseteq M$, $\operatorname{int}(A) = B$, and
$A = \operatorname{ext}(B)$. (We usually write $A'$, $B'$ instead of
$\operatorname{int}(A)$, $\operatorname{ext}(B)$, but such a notation could be
confusing in the program code below.) The set
$A = \operatorname{ext}\operatorname{int}(A)$ is called the extent of the
concept $(A, B)$, while $B = \operatorname{int}\operatorname{ext}(B)$ is the
concept’s intent. Ordered by the “subconcept-superconcept relation”
$(A_1, B_1) \le (A_2, B_2) :\Leftrightarrow A_1 \subseteq A_2$, these
formal concepts form a complete lattice, called the concept lattice of the
formal context $(G, M, I)$. This concept lattice is in fact a lattice in the
sense of mathematical Lattice Theory (Birkhoff [1], see also Grätzer [10] for a
recent monograph). Moreover, every complete lattice is isomorphic to some
concept lattice, which means that the mathematical methods of Formal Concept
Analysis apply to each theory using lattices. The lattice operations join and
meet (or, synonymously, supremum and infimum) can be interpreted on formal
concepts as least common superconcept and greatest common subconcept.
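
The two operators and the definition of a formal concept can be tried out on a toy context. The following Python sketch is only an illustration: the context (cat, dog and tiger with the attributes carnivore and pet) is a hypothetical example, and the brute-force enumeration over all attribute subsets is chosen for brevity; it is not one of the algorithms discussed in this paper.

```python
from itertools import combinations


def intent(A, M, I):
    """int(A): all attributes shared by every object in A."""
    return frozenset(m for m in M if all((g, m) in I for g in A))


def extent(B, G, I):
    """ext(B): all objects having every attribute in B."""
    return frozenset(g for g in G if all((g, m) in I for m in B))


def all_concepts(G, M, I):
    """Enumerate all formal concepts by closing every attribute subset."""
    found = set()
    for size in range(len(M) + 1):
        for B in combinations(sorted(M), size):
            A = extent(B, G, I)
            found.add((A, intent(A, M, I)))
    return found


G = {"cat", "dog", "tiger"}
M = {"carnivore", "pet"}
I = {("cat", "carnivore"), ("cat", "pet"), ("dog", "carnivore"),
     ("dog", "pet"), ("tiger", "carnivore")}

for A, B in all_concepts(G, M, I):
    print(sorted(A), sorted(B))
```

In this particular context the least common superconcept of the object concepts of cat and dog has the intent {carnivore, pet}; with a different incidence relation the answer changes, which is the context dependence stressed in Section 2.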

Using lattices is not the only natural algebraization of conceptual opera- 
tions. For example, Lehmann and Wille [16] have presented a family of algebraic 
structures, called trilattices, that can be used to describe the triadic approach 
by C.S. Peirce to concepts. Wille has shown that these structures are useful for 
the algebraic description of nested conceptual graphs. But trilattices are much 
less understood today than lattices. 

Lattices are not only well investigated in their algebraic structure, they also
admit a simple and appealing graphical representation (the lattice diagram).
Interpreted as concept lattices, these diagrams have proven to support
communication about conceptual data. The commercial TOSCANA software² uses
such diagrams for better access to relational databases.

Mentioning TOSCANA, we should point out that there are several extensive
implementations of lattice algorithms, both related to Formal Concept Analysis
(e.g., Burmeister [2], Vogt [20], Lindig [17]) and independent (e.g., Freese [6],
Doignon and Falmagne [4]). Many of these programs can generate lattices from
data, prepare or compute graphical representations including labelings, and
test for structural properties such as distributivity. Others can translate
between data structures. Several implementations are devoted to exploration
procedures as described below.

4 Computing in a Concept Lattice 

It is a common misconception that in order to perform lattice computations,
one should first generate and store the lattice itself. There are simple and fast
algorithms for generating a concept lattice from a given formal context, and we
shall give such an algorithm below, but usually it is not necessary to generate
all concepts. The situation is similar to numeric computation: to perform
addition or multiplication, it would be absurd to first generate all numbers
that are potentially needed. It suffices to know that any number has a bit
representation that describes its behaviour in algebraic operations. Similarly,
a concept lattice that is used in practice may have billions of elements, or
even be infinite, and yet every single algebraic operation may be performed
without much effort. We briefly describe how this is done (in the finite case):

The usual computer representation of a set $S$ is a bit vector, each bit
representing an element of $S$. In fact, bit vectors of size $|S|$ can then be used
to represent all subsets of $S$. A 1 in position $s$ indicates that the element
$s \in S$ belongs to the respective subset, a 0 indicates that it does not. The set
operations $\cap$ and $\cup$ then correspond to the bitwise AND and OR, and thus
are at least as simple and fast as the elementary arithmetical operations. We
shall use them in the sequel without further mention. A formal context
$(G, M, I)$ can be represented by a $G \times M$ bit matrix with a 1 in position
$(g, m)$ indicating that $g$ and $m$ stand in relation $I$. The $g$-th row of this
matrix, considered as a bit vector of size $|M|$, represents the set

$$g' := \{m \in M \mid g \mathrel{I} m\},$$

and similarly the $m$-th column is a bit vector representation for
$m' := \{g \in G \mid g \mathrel{I} m\}$. Formal concepts were introduced as pairs
$(A, B)$ of sets, with $A \subseteq G$ and $B \subseteq M$; they can thus be encoded
as a pair of bit vectors of size $|G|$ and $|M|$, respectively. Writing gcs for the
greatest common subconcept and lcs for the least common superconcept, we find
that these are easy to compute:

$$\mathrm{gcs}((A_1, B_1), (A_2, B_2)) = (A_1 \cap A_2,\ \mathrm{int}\,\mathrm{ext}(B_1 \cup B_2)),$$

$$\mathrm{lcs}((A_1, B_1), (A_2, B_2)) = (\mathrm{ext}\,\mathrm{int}(A_1 \cup A_2),\ B_1 \cap B_2).$$
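As a rough illustration of these two formulas, the following sketch encodes sets as Python integers used as bit vectors. The three-object, four-attribute context `ROWS` and all function names are assumptions made for this example only; the derivation operators are computed naively, row by row.

```python
# Hypothetical toy context: bit j of ROWS[g] is 1 iff object g has attribute j.
ROWS = [0b0011, 0b0110, 0b0111]
N_OBJ, N_ATT = len(ROWS), 4
ALL_ATT = (1 << N_ATT) - 1

def intent(objs: int) -> int:
    """Attributes common to all objects in the bit set `objs` (the map int)."""
    common = ALL_ATT
    for g in range(N_OBJ):
        if (objs >> g) & 1:
            common &= ROWS[g]
    return common

def extent(atts: int) -> int:
    """Objects having every attribute in the bit set `atts` (the map ext)."""
    objs = 0
    for g in range(N_OBJ):
        if (atts & ~ROWS[g]) == 0:      # atts is a subset of row g
            objs |= 1 << g
    return objs

def gcs(c1, c2):
    """Greatest common subconcept: intersect extents, close the joined intents."""
    (a1, b1), (a2, b2) = c1, c2
    return (a1 & a2, intent(extent(b1 | b2)))

def lcs(c1, c2):
    """Least common superconcept: close the joined extents, intersect intents."""
    (a1, b1), (a2, b2) = c1, c2
    return (extent(intent(a1 | a2)), b1 & b2)

def object_concept(g: int):
    """The concept generated by a single object g."""
    return (extent(ROWS[g]), ROWS[g])

c0, c1 = object_concept(0), object_concept(1)
meet, join = gcs(c0, c1), lcs(c0, c1)
```

Apart from the two closures, each operation is a single bitwise AND or OR, which is the point made above: the lattice itself is never generated.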

The only non-obvious operations here are int ext and ext int. These are again
easy. The program in Figure 1 computes, for a given set $T \subseteq M$, its closure
$T'' := \mathrm{int}\,\mathrm{ext}(T)$:

BEGIN

T'' := M;

for g ∈ G do if T ⊆ g' then T'' := T'' ∩ g';

END

Fig. 1. Computing the closure $T'' := \mathrm{int}\,\mathrm{ext}(T)$ of an arbitrary set
$T \subseteq M$. For computing ext int(T), the roles of G and M are interchanged.



We have mentioned that there are simple algorithms to compute all concepts
of a formal context. One that was first implemented some twenty years
ago is so simple that it can be reprinted here, see Figure 2 (a more detailed
description is given in [9]). It relies on the fact that the operator
$T \mapsto \mathrm{int}\,\mathrm{ext}(T)$ is mathematically a closure operator. The program
computes, for a given closure operator $T \mapsto \mathrm{Clo}(T)$ on a finite set $M$,
all closures, i.e., sets that are of the form $\mathrm{Clo}(X)$ for some $X$.



procedure next_closure(var A : set; var found : boolean);

BEGIN

X := A; found := false; i := n;
while (not found) and (i > 0) do begin
   if i ∉ X then begin
      A := Clo(X ∪ {i});
      found := (X ∩ {1, ..., i - 1} = A ∩ {1, ..., i - 1});
   end;
   X := X \ {i}; i := pred(i);
end;

END

Fig. 2. Recursive algorithm to generate closed sets of a given closure
operator Clo on the set {1, 2, ..., n}. To generate all closed sets, start with the
closure of the empty set, A := Clo(∅), and repeatedly call the procedure
next_closure as long as found = true.
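For readers who prefer an executable form, the procedure of Figure 2 might be transcribed as below. This is a sketch under our own assumptions: the generator name `all_closures`, the three-row toy context `ROWS` and the helper `int_ext` are illustrative and not part of the original program. Unlike the Pascal version, the sketch overwrites the current closure only when a successor has actually been found.

```python
from typing import Callable, FrozenSet, Iterator

def all_closures(n: int,
                 clo: Callable[[FrozenSet[int]], FrozenSet[int]]
                 ) -> Iterator[FrozenSet[int]]:
    """Yield every closed subset of {1, ..., n}, stepping as in Figure 2."""
    a = clo(frozenset())            # start with the closure of the empty set
    found = True
    while found:
        yield a
        x, found, i = set(a), False, n
        while not found and i > 0:
            if i not in x:
                candidate = clo(frozenset(x | {i}))
                smaller = set(range(1, i))          # {1, ..., i - 1}
                found = (x & smaller) == (candidate & smaller)
                if found:
                    a = candidate
            x.discard(i)
            i -= 1

# Hypothetical toy context: each row is the attribute set g' of one object.
ROWS = [frozenset({1, 2}), frozenset({2, 3}), frozenset({3})]
M = frozenset({1, 2, 3})

def int_ext(t: FrozenSet[int]) -> FrozenSet[int]:
    """Closure T'' as in Figure 1: intersect all rows that contain t."""
    result = M
    for row in ROWS:
        if t <= row:
            result &= row
    return result

CLOSURES = list(all_closures(3, int_ext))
```

For this toy context the generator produces the six concept intents one after another, without ever storing the lattice.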



The complexity of such algorithms has been studied (see Kuznetsov and Objedkov
[15] for a recent comparison), and it seems that this one still competes. But
perhaps more interesting is its behaviour in practice. In a recent study,
Lindig [17] randomly generated formal contexts with 10,000-20,000 concepts. He
found that his implementation of the algorithm needed about one second per
thousand concepts; this is fast enough for most applications. Even the covering
relation (see [8] for an algorithm) can be computed for lattices of this size in
less than a minute. Since the algorithm is so simple, it can be modified in
many ways with little effort: it can easily be made incremental, can be restricted
to sublattices and other subsets of interest, can be used “under automorphisms”,
et cetera [9]. Kuznetsov [12], [13], [14] has shown that determining the number
of concepts of a given formal context is #P-complete. We should point out that
there are many recent developments of new lattice generating algorithms, see
e.g. Stumme et al. [19], and the literature cited there.

We have already mentioned that computing a full concept lattice is rarely
necessary and often not desirable. A concept lattice of empirical data may be
very large, may be incompletely known, may vary with changing views (scalings) 
of the data. Computing with formal concepts is easy using their bit representa- 
tion. This leads to another key question: Why compute with concepts? What are 
the goals of conceptual computations? Again, a comparison with arithmetic is 
instructive: addition and multiplication of numbers are important because they 
enable us to solve tasks of practical value. To achieve solutions, certain math- 
ematical techniques or compound operations are used, like solving a system of 
equations. 

A list of ten prototypical tasks of conceptual knowledge processing was com- 
piled by Wille [22]. They are 

Exploring Searching Recognizing 

Identifying Analyzing Investigating Deciding 

Improving Restructuring Memorizing. 

In Wille’s paper, these ten tasks are described in detail and examples are given
of how they can be attacked in conceptual structures. Other authors have
suggested using concept lattices for browsing, retrieval and clustering, tasks that
may be subsumed under those of Wille’s list. Many applications are inspired by
the analogy to trees: lattices can be used, like trees, as search and decision
structures, but with additional features. One of the disadvantages of decision
trees, for example, is that the decisions must be made in a fixed order. A single
missing piece of information is likely to make the decision procedure
unsuccessful, even if the available information would suffice to determine a
result. Lattices are more flexible. In fact, one way to introduce concept lattices
is as decision structures where the decisions can be made in arbitrary order. An
elementary technique that supports these purposes allows one to move about in a
lattice, from elements to neighbouring elements. This compares to the change
directory command, used to move through tree-structured file systems, except that
the branching in a lattice is both upwards and downwards: a concept may have
several subconcepts, but also several (immediate) superconcepts. It is easy to
find all immediate subconcepts of a given concept. In Figure 3 we present an
algorithm that was mentioned in [8] (but was implemented much earlier). When
applied to the closure operator int ext of a formal context $(G, M, I)$, it
determines all (intents of) lower neighbours of a given concept. The same
algorithm, applied to the ext int-operator on the set $G$, can determine the upper
neighbours of any given concept. There is also a version that serves the same
purpose as the latter, but operates on the set $M$, like the one in Figure 3.
BEGIN

n_l := 0;

for m ∈ M \ B do MAXGEN[m] := true;
for m ∈ M \ B do
begin
   C := Clo(B ∪ {m}) \ B;
   if MAXGEN[n] = false for all n ∈ C \ {m} then
   begin
      n_l := n_l + 1;
      LN[n_l] := C;
   end else MAXGEN[m] := false;
end;

END



Fig. 3. Program to compute those closures of a given closure operator
Clo on the set M that cover a given closure B. n_l is the number and
LN[1..n_l] is a list of all such neighbours.
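The program of Figure 3 can be tried out with a few lines of Python. The sketch below rests on our own assumptions: the function name `lower_neighbours`, the toy context `ROWS` and the helper `int_ext` are illustrative, and the returned list holds the complete covering closures (B together with the added attributes C) rather than C alone.

```python
from typing import Callable, FrozenSet, List

def lower_neighbours(b: FrozenSet[int], m_all: FrozenSet[int],
                     clo: Callable[[FrozenSet[int]], FrozenSet[int]]
                     ) -> List[FrozenSet[int]]:
    """Closures covering the closed set b, found as in Figure 3."""
    rest = sorted(m_all - b)
    maxgen = {m: True for m in rest}
    neighbours = []
    for m in rest:
        c = clo(b | {m}) - b                 # attributes gained by adding m
        if all(not maxgen[n] for n in c - {m}):
            neighbours.append(b | c)         # clo(b + m) covers b
        else:
            maxgen[m] = False
    return neighbours

# Hypothetical toy context: each row is the attribute set of one object.
ROWS = [frozenset({1, 2}), frozenset({2, 3}), frozenset({3})]
M = frozenset({1, 2, 3})

def int_ext(t: FrozenSet[int]) -> FrozenSet[int]:
    """Closure of t: the intersection of all rows containing t."""
    result = M
    for row in ROWS:
        if t <= row:
            result &= row
    return result

BELOW_EMPTY = lower_neighbours(frozenset(), M, int_ext)
BELOW_3 = lower_neighbours(frozenset({3}), M, int_ext)
```

Each call needs one closure computation per attribute outside B, so moving from a concept to its neighbours never requires the whole lattice.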



The algorithms in Figures 2 and 3 are extraordinarily simple (yet useful). 
Most of the mentioned program implementations are, of course, much more 
advanced and cannot be described in detail here. 



5 “Floating Point” Lattice Operations 

One of the successful features of arithmetic is to allow a precise amount of impre- 
cision, realized by truncating the decimal or binary representation of numbers. 
“Floating point” computations usually lead to results that differ only
insignificantly from the true values. They can, if necessary, be repeated with
higher precision. Moreover, truncating often reflects the fact that the data are not given
with higher precision. A similar data simplification is used for high dimensional 
numerical data: the analysis is carried out only for the “dimensions of highest 
charge” . 

The use of bit representations for formal concepts makes it easy to define 
operations of restricted precision, “floating point” operations, for concept lat- 
tices, too. In fact, since the int ext-operator is local in the sense that there is no 
bit carry-over, there is no natural choice of most significant bits. How such a
coarsening is made depends on the intended interpretation. Wille [22] has
suggested understanding and using conceptual structures such as concept
lattices as conceptual landscapes of knowledge. Exploring such landscapes
can be done with lattice operations of restricted precision with varying choices
of significant bits. The most advanced implementation of this idea is the already
mentioned TOSCANA software.

To give an idea of how such a coarsening may be useful in practice, let us
briefly sketch a realistic scenario: Suppose we investigate data on the effects
of a certain diet. The parameters given for each subject in the study may be 
numerical (age, height, initial and final weight, . . . ) and non-numerical (health 
status, sports activities, ...). For a rough conceptual analysis, the numerical 
values may be too precise and can be replaced by coarser qualitative ones (young, 
tall, overweight, . . . ). These new attributes must be defined in terms of the given 
parameters, but are not required to be mutually exclusive: They may have their 
own “conceptual ontology”. With such a coarsening, we may discover regularities
of the data (“healthy normal-weight subjects did not lose much weight”). Then
these observations can be made more and more precise by focussing and refining 
the qualitative attributes, based on the given data. 

This is a general strategy, and it can be extended in many ways. It is, for 
example, possible to cover functional dependencies as well, see [9] for details. 
Notions of dimension can also be utilized for concept lattices, again in several 
ways. There is order-dimension and ordinal dimension, there are embeddings into 
direct products and tensor products, there are decompositions into overlapping 
“maps”, as in an atlas. The mathematical and algorithmic preconditions for
these notions are well understood (see [9]), and practical experience is growing
(see e.g. [17]).

6 Reliable Computation with Incomplete Data 

For many applications it is unrealistic to assume that data are completely given. 
We have to make do with incomplete information. There are different approaches 
to this phenomenon, including the use of probabilistic methods and fuzzy set
theory. It has been demonstrated (see e.g. Pollandt [18]) that these can be combined
with concept lattice computations, but this will not be treated here. Instead, we
discuss how to deal with incomplete but reliable data^.

Assume that we study a data context $(G, M, I)$ which is only very partially
known. We concentrate on the attribute combinations that occur in $(G, M, I)$.
To get an idea of what is meant, think of a context where $G$ is the set of all
natural numbers, $M$ is a set of properties like “prime”, “sum of two squares”,
“sum of two cubes” (of natural numbers), that these numbers may or may not
have, and where $I$ is the natural incidence. A typical question that could be
asked is

Is there a prime that is the sum of two cubes, but not the sum of two 
squares ? 

In general, such an elementary query can be described by a pair $(S, T)$ of sets
$S$ and $T$ of attributes representing the question

Is there an object having all the attributes from S, but none of the at- 
tributes in T? 

^ See also the article by Burmeister and Holzer in this volume. 

From observations or from other sources we may learn for some such pairs
$(S, T)$ whether the answer is yes or no, and thereby collect partial information
about the data context $(G, M, I)$.

Algorithmically, such information is difficult to handle. Even the most
elementary question, whether the answer to a query $(S, T)$ can be inferred from
earlier observations, turns out to be an NP-complete problem. This becomes
apparent when translating the setting into that of Propositional Calculus (with
the set $M$ as the set of propositional variables). Each query $(S, T)$ then
corresponds to a propositional clause $\bigwedge S \to \bigvee T$, and the inference
problem for such queries translates to the inference problem for clauses, a
problem well known to be algorithmically intractable. Thus, reconstructing the
combinatorial structure of a large formal context from elementary observations
of this kind is a very time-consuming task.

The computational conditions improve drastically if, instead of asking for 
the combinatorial structure, we explore the conceptual structure and ask for the 
possible concept intents rather than the attribute combinations of single objects 
(concept intents are attribute combinations in common to sets of objects). It
turns out that then atomic queries of the form $(S, \{t\})$ suffice to determine the
full structure. Such queries correspond to propositional formulae
$\bigwedge S \to t$, known as propositional definite Horn formulae. In Formal Concept
Analysis, implications are used instead, corresponding to more general
propositional formulae of the form $\bigwedge S \to \bigwedge T$. Again, the complexity
of the inference problem for such formulae is well known: it is essentially
linear. Working with implications to
explore the conceptual structure of a context is therefore much easier than the 
combinatorial approach. 
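To make the inference problem for implications concrete, here is a small sketch that decides whether one implication follows from a given list by forward chaining. It is deliberately the naive version, which may rescan the list several times and is therefore not the linear-time procedure alluded to above; the function names and the attribute names in the example are assumptions for illustration.

```python
from typing import Iterable, List, Set, Tuple

Implication = Tuple[Set[str], Set[str]]      # (premise, conclusion)

def close_under(attrs: Iterable[str], implications: List[Implication]) -> Set[str]:
    """Smallest superset of attrs that respects every implication."""
    closed = set(attrs)
    changed = True
    while changed:
        changed = False
        for premise, conclusion in implications:
            if premise <= closed and not conclusion <= closed:
                closed |= conclusion
                changed = True
    return closed

def follows(premise: Set[str], conclusion: Set[str],
            implications: List[Implication]) -> bool:
    """True if premise -> conclusion can be inferred from the implications."""
    return conclusion <= close_under(premise, implications)

# Hypothetical background knowledge over attributes a, b, c, d.
KNOWN = [({"a"}, {"b"}), ({"b", "c"}, {"d"})]
```

With these two implications, `follows({"a", "c"}, {"d"}, KNOWN)` holds because a yields b, and b together with c yields d, whereas `follows({"a"}, {"d"}, KNOWN)` does not.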

Skillful use of the mathematical theory in fact allows us to combine
combinatorial and conceptual information without abandoning computational
efficiency. To give an impression of how such more advanced methods are
introduced, we shall take, in the last section of this article, a glimpse at a
recent theoretical result. It is the starting point of a practical conceptual
knowledge acquisition method.

This will, however, require a pinch of mathematical framework. 



7 Cumulated Clauses 

As was mentioned above, attribute combinations may be described using
propositional clauses, i.e., expressions of the form $\bigwedge A \to \bigvee B$, where
$A$ and $B$ are subsets of the attribute set $M$. We say that an arbitrary attribute
set $F \subseteq M$ is a model of $\bigwedge A \to \bigvee B$ if and only if
$A \not\subseteq F$ or $B \cap F \neq \emptyset$. $F$ is a model of a set $\mathcal{C}$ of
clauses if it is a model of each clause in $\mathcal{C}$. It is well known that clauses
suffice to describe all set families. More precisely, for each set $\mathcal{F}$ of
subsets of $M$ there is some set $\mathcal{C}$ of clauses such that the elements of
$\mathcal{F}$ are precisely the models of $\mathcal{C}$.

A cumulated clause is an expression of the form

$$\bigwedge A \to \bigvee_{t \in T} \bigwedge B_t,$$

where $T$ is some index set and $A$ and each $B_t$, $t \in T$, are subsets of $M$. A set
$F \subseteq M$ is a model of $\bigwedge A \to \bigvee_{t \in T} \bigwedge B_t$ if and only
if $A \not\subseteq F$ or $B_t \subseteq F$ for some $t \in T$. Cumulated clauses
generalize both ordinary clauses (where $|B_t| = 1$ for each $t \in T$) and
implications (where $|T| = 1$). We use them because they have the
expressiveness of clauses and share some algorithmic properties of implications.
In particular, every set $\mathcal{F}$ of subsets of $M$ can be described by cumulated
clauses. For example we may use for each subset $A \subseteq M$ the cumulated
clause $\alpha_A$ defined as

$$\alpha_A := \bigwedge A \to \bigvee \{ \bigwedge F \mid F \in \mathcal{F} \text{ minimal w.r.t. } A \subseteq F \}.$$

Then $\mathcal{C} := \{\alpha_A \mid A \subseteq M\}$ is a set of clauses that describes
$\mathcal{F}$. Unfortunately, this set is of exponential size and therefore of little
practical value.

An interesting result now is that for each family of sets there is a natu- 
ral irredundant description with cumulated clauses. This is the key to a useful 
conceptual knowledge acquisition method. We present two recent mathematical
results that make this method work. 

First we need the notion of a pseudo model. The definition is elementary, but 
somewhat bewildering, because it is a recursive definition that seems to lack a 
base case. A closer look tells that this is not so. 

Definition 1. Let $\mathcal{F}$ be a set of subsets of the finite set $M$. A set
$P \subseteq M$ is called a pseudo model of $\mathcal{F}$ if it satisfies the following
two conditions:

1. $P \notin \mathcal{F}$ and

2. for each pseudo model $Q$ of $\mathcal{F}$ which is a proper subset of $P$ there is
some $F \in \mathcal{F}$ with $Q \subseteq F \subseteq P$.

Theorem 1. Let $M$ be a finite set and let $\mathcal{F}$ be a set of subsets of $M$.
Then the stem base

$$\{\alpha_P \mid P \text{ a pseudo model of } \mathcal{F}\}$$

is an irredundant set of cumulated clauses, the models of which are precisely the
elements of $\mathcal{F}$.
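Definition 1 and Theorem 1 can be checked by brute force on a small example. The sketch below is an illustration under our own assumptions (the family `FAMILY` on $M = \{1, 2, 3\}$ and every function name are hypothetical): it determines the pseudo models in order of increasing size, which is what gives the recursive definition its base case, builds the stem base, and compares its models with the family.

```python
from itertools import combinations

def subsets(m):
    """All subsets of m, smallest first."""
    items = sorted(m)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def is_model(f, clause):
    """clause = (A, [B_t, ...]); f is a model iff A is not inside f or some B_t is."""
    a, alternatives = clause
    return not a <= f or any(b <= f for b in alternatives)

def alpha(a, family):
    """Cumulated clause alpha_A: the minimal family members containing a."""
    above = [f for f in family if a <= f]
    return (a, [f for f in above if not any(g < f for g in above)])

def pseudo_models(m, family):
    """Pseudo models per Definition 1, found by increasing cardinality."""
    found = []
    for p in subsets(m):
        if p in family:
            continue
        # every smaller pseudo model q needs some f in family with q <= f <= p
        if all(any(q <= f <= p for f in family) for q in found if q < p):
            found.append(p)
    return found

M = frozenset({1, 2, 3})
FAMILY = {frozenset({1}), frozenset({2}), frozenset({1, 2, 3})}

PSEUDO = pseudo_models(M, FAMILY)
STEM_BASE = [alpha(p, FAMILY) for p in PSEUDO]
MODELS = {f for f in subsets(M) if all(is_model(f, c) for c in STEM_BASE)}
```

In this example the stem base has four cumulated clauses, compared with the eight clauses $\alpha_A$ obtained from all subsets $A \subseteq M$, and its models are exactly the three members of the family.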

A proof of this result can be found in [7]. It generalizes a result of Duquenne
and Guigues [5] which applies to the case that $\mathcal{F}$ is closed under
intersections. In that case, the stem base consists of implications and can be
handled efficiently because of the linear inference algorithm for implications.
This is frequently used for algorithmic purposes. For practical applications, the
implicational approach is almost sufficient, except that one would like to include
some non-implicational meta knowledge, mostly about the structure of many-valued
attributes. As already said, however, non-implicational propositional knowledge
cannot be handled efficiently. There is a way around this: using cumulated
clauses we can show that mixing in a few non-implications does not ruin the
efficiency of conceptual computation:

Theorem 2. Let $\mathcal{N}$ be a set of cumulated clauses. The number of set
operations necessary to decide if a given implication can be inferred from
$\mathcal{N}$ together with a given set $\mathcal{L}$ of implications is (for fixed
$\mathcal{N}$) linear in the size of $\mathcal{L}$.

This theorem allows us to extend the exploration techniques of Formal Concept
Analysis to many-valued contexts and, perhaps, to more complex conceptual
structures that can be represented in context language. First implementations
have already yielded promising results.



Abstract. The discovery and exploitation of symmetry plays a major
role in sciences such as crystallography, quantum theory, condensed-matter
physics, thermodynamics, chemistry, biology and others. It should not be
surprising, then, since Conceptual Structures are proposed
as a universal knowledge representation scheme, that symmetry should
play a role in their interpretation and their application. In this
tutorial-style paper, we illustrate the role of symmetry in Conceptual Structures
and how algorithms may be constructed that exploit this symmetry in
order to achieve computational efficiency.



1 Introduction 

1.1 Background and Motivation 

CGs hold great promise and have many applications [2]: natural language
processing and understanding, knowledge acquisition, systems modeling, and data
mining, to name a few. Development and research in this domain, however, have
suffered from the belief that this promising technology will not scale, even on
multiprocessor computers. Our research has given comfort to the skeptics. The first
attempt (early 1996) at converting 100,000 English sentences to CGs and matching
the CG database against a small set of queries took about 6.5 hours on a
Sun Sparc Ultra Enterprise 4000. Using a variety of techniques, including the
progressive application of our associative database methods III through VI, we
managed to reduce this to 19 seconds (of which 16 seconds is the overhead of
ontology loading and 3 seconds is the actual processing time) on the same Sparc
station (early 1997). Our goal is to speed the process up further so that this
technology can actually be useful on real-world-size problems.

In order to achieve the current level of efficiency, symmetry has been exploited 
in multiple ways. In this paper, we first outline the many ways symmetry arises 
in Conceptual Structures and their application and then we explain how such 
symmetry can be exploited in CG implementations. This paper is meant to
deepen the understanding of our previously published methods rather than to
present new methods.

1.2 Symmetry and Diversity 

Our research group has found it to be a useful pedagogical and research tool 
to divide information into two classes: diversity and symmetry. Symmetry is 
normally defined as “invariance with respect to transformation”. In this paper
we will give a much looser definition of symmetry as “shared information between
objects or systems and/or their representations”. The relationship between the
traditional definition and our definition can be seen if, for example, one considers
the bit-level representation of two objects, 1001 and 0111, where the bits represent
the boolean features “hair color is green”, “is male”, “is living”, and “can fly”,
respectively. In this representation we see that both sample objects share the
feature that they can fly and, when plotted in a four-dimensional Cartesian
coordinate space, would have the same magnitude in the fourth dimension. Further,
this correspondence would not change if their values were changed (transformed)
along the other dimensions.
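The bit-level example can be replayed in a few lines. This is only an illustrative sketch: the function name `shared_features` is our own, and we assume that the leftmost bit stands for the first listed feature.

```python
FEATURES = ["hair color is green", "is male", "is living", "can fly"]

def shared_features(x: int, y: int) -> list:
    """Names of the features that both 4-bit objects possess."""
    both = x & y                      # bits set in both representations
    width = len(FEATURES)
    # The leftmost bit corresponds to the first feature, as in the text.
    return [name for k, name in enumerate(FEATURES)
            if (both >> (width - 1 - k)) & 1]

SHARED = shared_features(0b1001, 0b0111)
```

Here `SHARED` contains only “can fly”, and flipping any of the first three bits of either object leaves that result unchanged, which is the invariance under transformation described above.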

Our approach to symmetry may not be easy to accept or get used to at first;
we hope that you, the reader, will humor us in the meantime. Also,
some of what follows is speculative, but has proved valuable as a paradigm for
developing, understanding and explaining our research and that of others. We
argue that the following terms are virtually synonymous: symmetry,
similarity, relationship, mutual information, redundancy, regularity, structure, 
constraint. In the case of mutual information, for example, knowing something 
about one information object tells us something about another as well. 

The theme of this paper is consistent with, and lends support to, the model of
“computing as information compression, alignment and search” proposed by Wolff
[16]. We consider different issues, however. Specifically, we explore how
symmetry (potential compression) can be exploited cost-effectively in graph
operations and what the relationship is between symmetry and the representation
of knowledge in conceptual structures.

1.3 Diversity Can Often Only Be Resolved Through Search 

Diversity or “potential surprise”, alternatively defined as “that which is left
over in a system after symmetry has been removed”, corresponds exactly to
the notion of complexity (amount of computation required) used in algorithmic
complexity analysis in Computer Science. Formal measures that further
characterize the information complexity of algorithms, such as Kolmogorov
complexity [15], have been developed. Our goal is not to replace this work but to
provide a deeper intuitive understanding of the interplay between computational
complexity and symmetry.

We can view most algorithm design as the problem of constructing minimal 
cost functions from inputs to outputs. Practically speaking, one never encounters 
a problem with “unbounded input size” and knowledge of symmetries can be 
used to reduce the size of the set of possibilities that needs to be explored. But 
once all such symmetry is exploited, all that remains to be done in traditional
AI systems is to “try stuff”. As possibilities are explored, further knowledge of
relationships (symmetry) may be invoked leading to further savings (through 
search space pruning). Thus, part of the art of search algorithm design is to 
explore things in an order (such as in binary search) that is expected to reduce 
the amount of required future computation (diversity). 

2 Exploiting Symmetry or Redundancy

Note that we are not saying that “symmetry” by itself is useful. What we are 
saying is that the recognition of symmetry and subsequent exploitation of this 
knowledge can be useful when the cost of recognition and exploitation is repaid
by the reduced complexity (diversity reduction) of future computations. In a
sense, unrecognized or unexploited symmetry is “false diversity”.

As an example of exploiting symmetry, consider Peirce’s published proof 
procedure for Existential Graphs Inference [11]: ^ To prove an arbitrarily complex 
graph T: 

1. Start with an empty sheet. 

2. Using various rules of inference (Peirce gave 5 “alpha” rules which are suffi- 
cient to do the job) add, delete and change graphs on the sheet. 

3. The theorem T is proven if the graph T ever appears alone or alongside 
others on the sheet. 

At an ICCS95 evening session we proposed and proved the soundness and 
completeness of the following alternative “proof by contradiction” method, sim- 
ilar to that used in resolution refutation [12]: 

1. Start with an empty sheet. 

2. Add the negated theorem to be proved, T, to the sheet. (Negation is done 
by drawing a circle around T.) 

3. Using the alpha rules add, delete and change graphs on the sheet. 

4. The theorem T is proven if an empty circle (standing for “false” or
contradiction) ever appears on the sheet alone or alongside others.

Although these two proof procedures accomplish the same thing, we argue 
that the second is superior for the following reasons: 

— Since we are moving from complexity (circled T, where T is arbitrarily com- 
plex) towards simplicity (empty circle), quite often, simplifying rules (dele- 
tion and erasure) are correct to apply. 

— Now all search attempts have the same goal! By exploiting this redundancy 
a human or mechanical theorem prover can more easily learn and transfer 
experience from proof to proof. 

— In particular, any intermediate sheet that occurs in a derivation of the empty
circle is a valid goal for future proofs, since all such sheets embody contradictions.

Here, we have exploited “symmetry across computations”, exactly the mo- 
tivation behind the traditional Computer Science techniques of caching, memo- 
ization, and dynamic programming. 

^ Existential graphs (EGs) are a precursor of Sowa’s [14] conceptual structures. The 
following discussion does not require knowledge of the details of EGs. 

3 Symmetry and Conceptual Structures

In conceptual structures themselves, symmetry arises at the node level (due to 
types), graph level (due to analogy and existential graph inference rules) and 
database level (due to graph matching and storage levels). We will discuss each 
of these occurrences in the following sections. 



3.1 Concept Types and Type Hierarchy 

In Conceptual Structures, multiple concept labels may replace concept nodes in 
a conceptual structure and multiple relation labels may replace relation nodes in 
a conceptual structure. Such valid substitutions are normally stored in the ontol- 
ogy or type hierarchy that is associated with a Conceptual Structures database. 
In particular, words Y more specific than a word X share all of X’s constraints 
and may be substituted for the word X in an answer to a question. In fact, the 
word X may be viewed as an abstraction or name for its successors like Y. For 
example, when asked for an example of “long-tailed” animals, squirrels may be 
a valid answer. Thus, all squirrels are expected to be symmetric with respect to 
tail type. In our view, any two objects (represented as conceptual structures) or 
simple entries in a type hierarchy are symmetric with respect to the attributes
or subgraphs they share.



3.2 Relations as Symmetry 

Consider the statement “X is the mother of Y”. Then all pairs, for example
(Hillary Clinton, Chelsea Clinton), and (Mrs. Brady, Marcia Brady), share sym- 
metry as they are analogous with respect to satisfying the Mother-Of relation. 
It is this ability for relations to refer to equivalent pairs (or more than 2 - for 
non-binary relations) that gives them first class status in Conceptual Structures. 



3.3 Classification and Symmetry 

The above examples are meant to show the tight correspondence between classi- 
fication and symmetry: Two objects that are classified as being the same on one 
or more dimensions, will be treated analogously in decisions based on just those 
dimensions. Here “dimensions” are meant to be any attributes or features of the 
represented objects and need not be orthogonal or mutually independent. 

In a CG database, two objects represented by the same conceptual struc- 
ture, will be treated identically - regardless of which facts have been left out. 
What is not represented can not be exploited! Thus, classification is a very pow- 
erful tool that can be properly applied for efficient reasoning (as in scientific 
generalization) but also misapplied as in stereotypes of people by the color of 
their skin, sex, religion or birthplace, etc. 

3.4 Graph-Level Symmetry

Matching of graph-level structure occurs in a CG database during: 

— Query processing: An answer to a query Q is a supergraph of Q where node 
matching may invoke the type hierarchy. 

— Graph insertion: To insert a graph G into a CG hierarchy requires finding
the immediate subgraphs and immediate supergraphs of G in the hierarchy.

— Graph deletion: To delete a graph we must find it in the database. 

— Logical inference: In Peirce’s Alpha rule of deiteration [11] we are allowed to 
find and remove graphs (in a nested CG structure) that are the same as an
outer graph. 

— Analogical Reasoning: Analogies occur when conceptual structures have iso- 
morphic or similar relational structure and concept nodes are related anal- 
ogously as well. Such analogical reasoning is one of the main benefits of the 
use of CGs compared to first-order logic representations (such as PROLOG)
that suppress or ignore node and relation topology. 

4 Supporting Operations 

The two main mathematical operations required to implement the graph matching
operations above are graph isomorphism and subgraph isomorphism. Informally,
in subgraph isomorphism, one is looking for a mapping between nodes in
one graph G1 and nodes in a second graph G2 such that every edge-adjacency
in G1 has a corresponding edge-adjacency in G2. In CGs, this matching also
allows for some label relaxation due to the type hierarchy. In graph isomorphism,
the mapping must go both ways. Subgraph isomorphism is known to be
NP-complete, even for the special class of “bipartite graphs” that CGs fall into.
It is an open problem whether graph isomorphism is in the class P (polynomial
time) or not.
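To make the informal description of subgraph isomorphism with label relaxation concrete, here is a plain backtracking sketch. It is not the implementation discussed in this paper: the graph encoding (a label dictionary plus a set of directed edges), the single-parent type hierarchy `PARENT`, and all names are assumptions chosen for illustration, and no symmetry is exploited.

```python
def is_subtype(label, general, parent):
    """True if label equals general or lies below it in the type hierarchy."""
    while label is not None:
        if label == general:
            return True
        label = parent.get(label)
    return False

def find_embedding(g1, g2, parent):
    """Map nodes of g1 to distinct nodes of g2 so that every edge of g1 is an
    edge of g2 and each image label is a subtype of the original label.
    Returns the mapping as a dict, or None if no such mapping exists."""
    labels1, edges1 = g1
    labels2, edges2 = g2
    nodes1 = list(labels1)

    def extend(mapping):
        if len(mapping) == len(nodes1):
            return dict(mapping)
        u = nodes1[len(mapping)]
        for v in labels2:
            if v in mapping.values():
                continue
            if not is_subtype(labels2[v], labels1[u], parent):
                continue
            mapping[u] = v
            if all((mapping[a], mapping[b]) in edges2
                   for a, b in edges1 if a in mapping and b in mapping):
                result = extend(mapping)
                if result is not None:
                    return result
            del mapping[u]                      # backtrack
        return None

    return extend({})

# Hypothetical query "an animal eats" and data graph "a monkey eats a walnut";
# relation nodes are ordinary nodes, so both graphs are bipartite.
PARENT = {"MONKEY": "ANIMAL"}
QUERY = ({"q1": "EAT", "q2": "AGNT", "q3": "ANIMAL"},
         {("q1", "q2"), ("q2", "q3")})
DATA = ({"d1": "EAT", "d2": "AGNT", "d3": "MONKEY", "d4": "OBJ", "d5": "WALNUT"},
        {("d1", "d2"), ("d2", "d3"), ("d1", "d4"), ("d4", "d5")})

MATCH = find_embedding(QUERY, DATA, PARENT)
```

In this example most candidate pairings are rejected immediately by the label test, so hardly any search is needed.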

Hence, the operations required in CG computation could be expensive in the
worst case. But the good news is that with a little cleverness
(symmetry exploitation) their cost can be dramatically reduced for the average
practical case. Furthermore, these costs can be shared (via symmetry) across
the multiple graph matching operations that are required in a CG database.
Specifically, efficiency is improved by using the set of design guidelines outlined
in the next section.

Note that one could also be interested in finding symmetries between nodes
in a single graph G. This information could, possibly, be used to compact the
representation of G and speed up further operations on G. In general, we do not
exploit this type of symmetry in what follows. In fact we make use of the lack of
symmetry in most natural language graphs to process them more efficiently, since
it is easy to distinguish their nodes without search (i.e., most analogies between
nodes fail immediately at the node label level or at their immediate adjacencies).

4.1 Conceptual Graph Implementation Guidelines [9] 

The following are proposed as useful guidelines, not absolute principles.

REDUNDANCY REMOVAL:

Every primitive data object, label or symbol should be stored only once
with pointers used to denote the actual uses of the object.

MINIMAL COMPOSITION: 

Every compound object should be stored with the minimum information 
required to represent the combination of its parts. 

PROPER GRAIN SIZE: 

Given no loss of accuracy, objects should be processed at the highest level 
of abstraction possible. 

A CONCEPTUAL GRAPH AS A SET OF TUPLES. 

If one were to implement a conceptual graph based on the diagrammatic 
representation, the costs associated with storage and matching would be 
much higher than they need to be. For example, the chemical shown in 
Figure 1, if represented as diagrammed, would require 9 nodes and 7 edges. 
Using the hypergraph-based tuple notation at the bottom of the figure we 
only need 4 nodes and 3 edges. 

In fact, most natural language CGs can be rendered into tuple notation 
without using any edges. Figure 2 shows the Display Form of the Conceptual 
Graph which reads: a monkey is eating a walnut with a spoon made out of the 
walnut’s shell. This CG has a cycle in it, so transcribing it into a Linear Form 
that can be typed into a computer requires some finessing. Some concept 
node needs to be picked as the head. Usually, the concept node with most 
arcs linked to it makes for the best choice for the head that produces the 
simplest CG. Picking [EAT] in our example for the head, yields the following 
Linear Form. 

[EAT] -
    (AGNT) -> [MONKEY]
    (OBJ) -> [WALNUT: *x]
    (INST) -> [SPOON] -> (MATR) -> [SHELL] <- (PART) <- [WALNUT: *x].

Note the usage of the symbol *x. It is used as a variable to denote an
unspecified individual of type [WALNUT]. Both occurrences must refer to the same
individual, so it is a binding variable. In the Tuple Notation that we have developed,
this binding variable is not used. In our convention, a rose is a rose is a rose.
In this case, a walnut is a walnut is a walnut. All occurrences of a concept
node are the same, unless differentiated. In the tuple notation, one walnut
is differentiated from another with a number designator. The following CG
would be read as: a monkey is eating a walnut, with a spoon made from a
shell of another walnut.

[EAT] -
    (AGNT) -> [MONKEY]
    (OBJ) -> [WALNUT.1]
    (INST) -> [SPOON] -> (MATR) -> [SHELL] <- (PART) <- [WALNUT.2].
Relation Table

1 double bond [Oxygen, Carbon]
2 single bond [Oxygen, Carbon]
3 single bond [Oxygen, Nitrogen]
4 single bond [Nitrogen, Carbon]

Relation-Based CG Representation

Fig. 1. CG Compaction

Fig. 2. A conceptual graph with a cycle



Alternatively, instead of the concept node [EAT], the concept node [SPOON] 
could be the head, which would produce the following Linear Notation. 

[SPOON] -
    (INST) <- [EAT] -
        (OBJ) -> [WALNUT] -> (PART) -> [SHELL: *y]
        (AGNT) -> [MONKEY],
    (MATR) -> [SHELL: *y].

In the Tuple Based Notation, which we developed, the above would be
rendered as follows:

CG1: {
    AGNT(EAT, MONKEY),
    OBJ(EAT, WALNUT),
    INST(EAT, SPOON),
    MATR(SPOON, SHELL),
    PART(WALNUT, SHELL)
}.

It is as if each conceptual relation is simultaneously the head of the
Conceptual Graph: no node is favored as the head, so no part of the graph is
made less accessible to a searching agent. Any subset of the total CG can be
isolated and used as a means for retrieval, or for a join against another
utterance. At the same time, no strange variables have been introduced. This is
consistent with the modern view of language as sets (as opposed to sequences)
of conceptual constructs.

RECURSIVE COMPOSITION:

The same abstraction mechanism that goes from labels to graphs can be
taken one step further to facilitate the storage and retrieval of nested graphs.

TOPOLOGICAL INFORMATION:

A graph is itself the best descriptor of its nodes. A graph is the most compact
representation of all the adjacencies and shortest paths between its nodes.
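The tuple notation described under A CONCEPTUAL GRAPH AS A SET OF TUPLES can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `make_graph` and `matches` are hypothetical, tuples are compared verbatim, and the label relaxation provided by the type hierarchy is ignored.

```python
from typing import FrozenSet, Tuple

# A relation tuple is (relation, arg1, arg2, ...); a graph is a set of such tuples.
RelTuple = Tuple[str, ...]
Graph = FrozenSet[RelTuple]


def make_graph(*tuples: RelTuple) -> Graph:
    """Build a graph; duplicate tuples collapse because a graph is a set."""
    return frozenset(tuples)


def matches(query: Graph, graph: Graph) -> bool:
    """True if every tuple of the query occurs verbatim in the graph.

    Assumption of this sketch: no type hierarchy, so labels must be identical.
    """
    return query <= graph


# The monkey example: no head node and no binding variables are needed.
CG1 = make_graph(
    ("AGNT", "EAT", "MONKEY"),
    ("OBJ", "EAT", "WALNUT"),
    ("INST", "EAT", "SPOON"),
    ("MATR", "SPOON", "SHELL"),
    ("PART", "WALNUT", "SHELL"),
)

# Any subset of the tuples can serve as a retrieval key.
query = make_graph(("AGNT", "EAT", "MONKEY"), ("INST", "EAT", "SPOON"))

# A differentiated walnut (number designator) is a different node label.
other_walnut = make_graph(("PART", "WALNUT.2", "SHELL"))
```

Because the graph is an unordered set, the choice of head that complicates the Linear Form does not arise; the subset test is the whole matching step in this simplified setting.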
5 Six Organizational Methods

The design guidelines have been used to develop six database retrieval methods
for Conceptual Graphs that have been discussed in previous papers by our group
[7] and others. Each method exploits a larger amount of symmetry and hence
(due to diversity reduction) achieves a deeper level of storage and retrieval
efficiency. Complexity counts are rough estimates where each graph comparison is
considered O(1).

— METHOD I: Arbitrary Flat Ordering. This is the naive system (used as a 
reference point to compare the other systems to) which involves no symmetry 
exploitation. Graphs are placed in a file in a random ordering. Space costs 
and retrieval costs are maximal. 

— METHOD II: Two-level Ordering. Common features (subgraphs) of database 
graphs are used to create a set of “screens” that are used to prune the number 
of graphs that need to be accessed at the higher level. The addition of these 
screens increases storage costs over Method I but reduces retrieval costs by 
some constant factor K. Many fewer graph comparisons are required at the
high level, but all graphs must still be compared at the screen level.

— METHOD III: Multi-Level Partial Order. The full partial order relation
“subgraph-of” is imposed on the database graphs and the relation is stored
by storing its transitive kernel (transitive arcs have been removed) in a
hierarchy [6]. Further graphs may be added to the database to further balance
and integrate the partial order. This is the main method currently in use
in most semantic network and conceptual graph implementations [13,1].
We see that storage costs have further increased, as we are storing still
more pointers, but retrieval costs have dropped dramatically. Empirically
such databases require log**2(N) graph comparisons on a database with N
graphs and provide access times that are respectable for many real-time
applications. Gerard Ellis and Steven Barnes have provided implementation
enhancements to the original Method III.

— METHOD IV: Hierarchical Node Descriptor Method. This method, developed
with Gerard Ellis [8], exploits further symmetry between the graphs
by observing that neighborhoods of nodes of a given label (such as Carbon
in chemistry) are often the same as, or similar to, node neighborhoods in
other graphs. Furthermore, these neighborhoods may be broken up into
“descriptor units” which involve two nodes and their path distances. Descriptor
units are shared across many node descriptors. Thus, storing graphs as sets
of node descriptors, and node descriptors as sets of descriptor units, provided
substantial savings at the storage and graph matching levels.

However, this method proved rather complicated to implement, and in the
18-month process of doing so, Methods V and VI were discovered! Fortunately,
by further exploiting symmetry and special characteristics of conceptual
graphs, we are now able to get the power of Method IV but with a much
simpler implementation.

— METHOD V: Shared (symmetric) Node Mappings.

Method V was described in one paragraph in the UDS Paper [9]. In Method
III Phase I, a node X is not queried unless each of its predecessors has 
successfully matched as a subgraph (generalization) of the query graph. The 
idea of Method V is to exploit the work that has been done to find proper 
node bindings between the predecessor CGs and the query when finding 
proper node bindings between X and the query. In particular, each set of 
bindings must be consistent with (and hence only augments) all previous 
sets of bindings found. 

In order to bootstrap in this way, node binding information is added to all 
the predecessor links in the traditional Method III hierarchy. For each link 
A -> B we store (as a boolean matrix) for each node in A, which nodes in
B it could possibly bind to in a correct subgraph-isomorphism match. 

— METHOD VI: The Universal Data Structure. 

Method VI is described in detail in the UDS paper. It exploits every design
principle outlined above. By maximally exploiting symmetry, both storage
costs and retrieval times are reduced! In fact, it uses 1/100th of the storage
of Method III while producing a 100-fold speedup. The reason it is
called a Universal Data Structure is that we propose that it can be used as
an implementation for any of the following traditional AI methods: conceptual
graphs, semantic networks, relational databases, RETE rule matching,
neural network learning and automatic theorem proving.^

In Method VI (see Figure 3), primitive node labels are stored (in a hash table) 
at the bottom of the hierarchy, tuples are stored as sequences of pointers 
to the node table, and graphs are stored as sets of pointers to the tuple 
table. Nested graphs point to the graph table, etc. Tuple tables essentially 
take the structure of relational databases. With just an elaborate set of 
pointers being stored, you might legitimately ask “but where is the beef?”
The truth is that when on a diet high in symmetry, beef is not necessary:
two structures (nodes, tuples, graphs) match if they have exactly the same 
pointers or, if a comparison is needed, their set of pointers is a subset of 
the other set (modulo the type and graph hierarchies). This definition of 
matching is sufficient for more than 99 percent of the graphs that we have 
seen arising from CG representation of natural language sentences. Difficult 
graphs such as those that occur in VLSI logic and organic chemistry will 
require further processing - due precisely to the fact that large amounts of 
apparent symmetry in the graph nodes (carbon atoms, for example, as in 
Figure 1) need to be distinguished based on higher level graph structures. 
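The layered tables of Method VI can be illustrated with a small Python sketch. It is not the UDS implementation: the class name `UniversalStore` and its methods are hypothetical, integer ids stand in for pointers, nested graphs are omitted, and the type and graph hierarchies are ignored, so matching reduces to the plain subset test on tuple ids.

```python
class UniversalStore:
    """Toy three-level store: labels -> tuples -> graphs, each stored once."""

    def __init__(self):
        self.labels = {}   # node/relation label -> label id (hash table)
        self.tuples = {}   # sequence of label ids -> tuple id
        self.graphs = []   # graph id -> frozenset of tuple ids

    @staticmethod
    def _intern(table, key):
        # Store each distinct object only once; reuse its id afterwards.
        if key not in table:
            table[key] = len(table)
        return table[key]

    def add_tuple(self, relation, *args):
        ids = tuple(self._intern(self.labels, x) for x in (relation, *args))
        return self._intern(self.tuples, ids)

    def add_graph(self, tuples):
        self.graphs.append(frozenset(self.add_tuple(*t) for t in tuples))
        return len(self.graphs) - 1

    def retrieve(self, query_tuples):
        """Ids of stored graphs whose tuple-id set contains the query's."""
        wanted = set()
        for t in query_tuples:
            ids = tuple(self.labels.get(x) for x in t)
            tid = self.tuples.get(ids)
            if tid is None:        # tuple never stored: nothing can match
                return []
            wanted.add(tid)
        return [i for i, g in enumerate(self.graphs) if wanted <= g]


store = UniversalStore()
g0 = store.add_graph([("pos", "Sue", "pie"), ("cont", "pie", "apples")])
g1 = store.add_graph([("cont", "pie", "apples")])
```

In this toy version the shared tuple `cont[pie, apples]` is stored once and referenced by both graphs, which is the redundancy-removal guideline applied at the tuple level.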

6 Further Enhancing the Methods: Parallelism 

The idea of parallelism, that several computations may take place at the same
time given adequate hardware, is yet another method of exploiting symmetry.
By adding the redundancy of multiple processors, a speedup on the order of the
number of added processors is possible. Interestingly it is the “independence”
(lack of symmetry or relationship) between the results and partial results of the
computations that allows us to parallelize them.

^ Work on applying Method III hierarchies to automatic first order logic theorem
proving has been done with John Esch [3].

Node Hierarchy

person
    man, girl
        Dan, Frank, Sue
food
    pie, apples

Relation Hierarchy

1 eat [agent, obj, manr]
    2 eat [person, , ]
    11 eat [ , , quickly]
5 cont [whole, part]
    6 cont [pie, apples]
9 pos [owner, owned]
    10 pos [Sue, pie]

G6 {6,8,10}

Fig. 3. Universal Data Representation

James D. Roberts has designed parallel hardware and algorithms for enhanc- 
ing CG computation. He recognized that many of the graph comparisons re- 
quired by Methods III-VI can be scheduled independently of each other. He also 
recognized that when comparing two graphs for subgraph-isomorphism using
the technique of “refinement”, efficiency may be enhanced by parallelizing the
underlying boolean matrix operations. His considerations led to an expected
graph retrieval time of O(n**2/logN) using only O(logN**2/b) processors [5].
As stated above, Method III-VI systems require O(log(N)**2) graph
comparisons.

7 The CG Mars Lander System 

The CG Mars Lander [4] is a retrieval system designed for the purpose of an- 
swering factual questions about a body of documents. The results reported here 
are for a Method III style implementation - a Method VI style implementation 
is underway that will also integrate with Protege [10]. English discourse, coupled 
with ontology and queries are inputs to the system, and the result is an English 
answer with appropriate references to where in the English discourse the query 
gets answered. 



7.1 Summary of Results and Timings 

Our current program can accept a large set of CGs (tens of thousands), an 
ontology of some few hundred thousand words, and a set of queries. The program 
stores the CGs in a database which can be saved and restored, and answers queries
by returning relevant CGs from the previously saved DB. The timing statistics on
a Sun Ultra Enterprise 4000 (with four 167 MHz UltraSPARC CPUs with 512 KB of
external cache, and 256 MB of main memory) are as follows: reading, processing, and
storing an 18,000 CG input file takes 1 hour and 46 minutes. Reloading the above
DB takes on the order of seconds. A 150,000 word ontology is processed in 16
seconds. Each query is handled in 5.5 seconds. For smaller databases (hundreds
of CGs only), the time to handle a single query can be as low as 0.2 seconds. A
typical CG consists of some ten tuples, each of which has two to five arguments. 
Some large CGs can, however, reach up to 30 tuples (with no effective limit in 
the program). 



7.2 Cost Benefit Analysis 

Suppose you only plan to do a small number of queries Q, over a database of N
CGs. The question arises whether it is worth the overhead of creating a Method
III database, in which retrievals and insertions take approximately log10(N)**2
comparisons for a database of size N. The alternative is simply to compare each
query to every CG, ignoring symmetry across queries.

Here we do the necessary math to aid in the decision:

No pre-compilation of database: cost is N * Q graph comparisons (call this
Method I).

Creating a Method III database: average cost per insertion is approximately
log10(N)**2, giving a cost to create the database of N * log10(N)**2 and a cost
to answer the queries of Q * log10(N)**2. So, the question is when N * Q is <
(N * log10(N)**2 + Q * log10(N)**2). Table 1 contains the results.

Table 1. Comparison table between costs of respective methods

N         Q         Method I Cost    Method III Cost
10        1         10               5.00
10        10        100              14.90
10        100       1,000            104.80
100       1         100              296.60
100       10        1,000            328.64
100       100       10,000           688.60
1,000     1         1,000            7,293.43
1,000     10        10,000           7,374.44
1,000     100       100,000          8,184.40
1,000     1,000     1,000,000        16,284.44
10,000    1,000     10,000,000       152,823.78
10,000    10,000    100,000,000      296,823.78

From this it can clearly be seen that Method III is most cost effective except 
for very small ratios of queries to database size, and for very high ratios the 
savings grow exponentially. Methods IV-VI provide further order of magnitude 
improvements over these numbers. 
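The break-even reasoning can be sketched as a short calculation. This is a hedged illustration, not the program used to produce Table 1: it assumes the per-operation cost is the squared base-10 logarithm (in line with the log**2(N) estimate of Section 5), uses the closed-form build cost N times that per-operation cost, and all function names are hypothetical. Its outputs are therefore not expected to reproduce the table's Method III column exactly.

```python
import math


def method1_cost(n: int, q: int) -> float:
    """No pre-compiled database: every query is compared with every graph."""
    return n * q


def method3_cost(n: int, q: int) -> float:
    """Build cost plus query cost, assuming log10(n)**2 comparisons per operation."""
    per_op = math.log10(n) ** 2
    return n * per_op + q * per_op


def break_even_queries(n: int):
    """Smallest Q for which Method III is strictly cheaper than Method I.

    From n*Q > n*L + Q*L with L = log10(n)**2 we get Q > n*L / (n - L).
    Returns None when n <= L, i.e. when no such Q exists.
    """
    per_op = math.log10(n) ** 2
    if n <= per_op:
        return None
    return math.floor(n * per_op / (n - per_op)) + 1
```

Under these assumptions the hierarchy pays for itself once the number of queries exceeds roughly log10(N)**2, since N is much larger than the logarithmic term for any database of realistic size.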
Abstract. This paper provides an introduction to SNePS 3, the latest
entry in the SNePS family of knowledge representation and reasoning 
(KRR) systems. The emphasis is on SNePS 3 as an example of a logic- 
based network KRR system. 



1 Introduction 

SNePS 3 is the latest entry in the SNePS family [1] of knowledge representation 
and reasoning (KRR) systems.^ It is based on SNePS 2.5 [2] and ANALOG 
[3,4], and is currently being implemented in CLOS, the Common Lisp Object 
System [5]. This paper provides an introduction to SNePS 3 as an example of 
a logic-based network KRR system. In the rest of this paper, I will use the 
term “SNePS” when discussing features that are generally true of formalisms 
in the SNePS family, and “SNePS 3” when discussing features that distinguish 
SNePS 3 from other family members. 

Due to space considerations, I will only be able to provide an informal and 
incomplete introduction to SNePS 3. Section 6 will be a more formal, but still 
incomplete, introduction to the syntax of the SNePS 3 language. 

Informally, information is represented in SNePS as a network of nodes and la- 
beled directed arcs. An arc label may be used on arbitrarily many different arcs, 
but each node has a unique identifier. For example, Fig. 1, based on a project in
which Cassie, a SNePS-based agent [6], is embodied as a Foveal Extra-Vehicular
Activity Helper-Retriever (FEVAHR) robot [7], represents the following
information:^

Cassie is talking to and looking at Stu. Cassie is a FEVAHR. FEVAHRs 
are robots. Stu and Bill are people. People and robots are agents. 

How the network of Fig. 1 represents this information, i.e., the syntax and 
semantics of SNePS networks, will be explained below. 

^ “SNePS” originally stood for “Semantic Network Processing System.” Now, I would 
prefer it be thought of as a name in its own right. 

^ Fig. 1 shows only a small part of the FEVAHR knowledge base, chosen to illustrate 
the issues discussed in this paper. 



B. Ganter and G.W. Mineau (Eds.): ICCS 2000, LNAI 1867, pp. 510-524, 2000.
© Springer-Verlag Berlin Heidelberg 2000




An Introduction to SNePS 3 







Fig. 1. A SNePS network representing the information that: Cassie is talking to Stu 
(M12!); Cassie is looking at Stu (M16!); Cassie is a FEVAHR (M7!); FEVAHRs are 
robots (M31!); Stu and Bill are people (M20!); people and robots are agents (M33!). 
Tables 1-4 give the intended semantics of each node. 



2 KRR Systems as Logics 

Every KRR system is a logic, in the sense that it has, or should have, a well- 
defined syntax, compositional semantics, and inference procedure. SNePS has 
been designed specifically as a logic to support natural language (NL) compe- 
tent agents [3, 4, 6, 8, 9]. SNePS constitutes the “language of thought” of a SNePS- 
based agent. The domain of discourse of that language, the set of entities that 
are denoted by well-formed SNePS expressions, is the domain of all mental en- 
tities conceivable by the agent. Adding a new expression to a SNePS network 
implements the agent’s conceiving of the entity represented by that expression. 
The basic SNePS principles are

Propositional Semantic Network: The only well-formed SNePS ex- 
pressions are nodes. 

Term Logic: Every well-formed SNePS expression is a term. 

Intensional Representation: Every SNePS term represents (denotes) 
an intensional (mental) entity. 

Uniqueness Principle: No two SNePS terms denote the same entity. 

Paraconsistent Logic: A contradiction does not imply anything what- 
soever. 

The significance of SNePS being a propositional semantic network is that 
only nodes are well-formed expressions having semantics, whereas in many other 
network-based KRR systems, including many semantic networks, arcs denote 
propositions — relations that are asserted to hold between entities. (See [10] for 
a discussion of assertional vs. structural information.) 

The significance of SNePS being a term logic is that proposition-denoting 
terms may be arguments of other terms without leaving first-order logic [11]. 

The significance of SNePS using intensional representation is that cogni- 
tively distinct mental entities are denoted by distinct terms even if they are 
co-extensional — the entire network forms an opaque context in which there is 
no substitution of equals, because no two terms are fully equal [12,13,14,15]. 
This principle can also be read as “Every SNePS term denotes a mental entity,”
which means that no terms are created for purely technical reasons, and that 
even the analogue of logical variables, “variable nodes”, have consistent com- 
positional semantics denoting mental entities. Variable nodes did not have such 
semantics in earlier versions of SNePS [16]. Supplying that was a main motivator 
of ANALOG [3,4] and SNePS 3. 

The Uniqueness Principle supports intensional representation, and imposes 
a requirement of structure-sharing on the implementation — no two distinct but 
structurally equal terms exist in a network. 

SNePS being a paraconsistent logic means that a contradiction in one area 
of the knowledge base would not “corrupt” the information in another area of 
the knowledge base. 

3 Levels 

To a large extent, SNePS is a logical-level semantic network [17]. Just as a Prolog 
programmer must choose the predicates to be used in a Prolog program, a SNePS 
user must choose the arc labels and other representational techniques to use in a 
SNePS-based application. This person is called a “knowledge engineer”, and the 
job of a knowledge engineer has been referred to as “conceptualization” [18, p. 
9]. For example, Figures 2 and 3 show two possible ways of representing Cassie
is a FEVAHR, and another is shown in Fig. 1. 

However, just as the syntax, semantics, and inference mechanism of Horn 
clauses is built into Prolog, there is a level at which syntax, semantics, and an 
inference mechanism is built into SNePS. These will be discussed below. 
Fig. 2. A possible representation of Cassie is a FEVAHR.

Fig. 3. Another possible representation of Cassie is a FEVAHR.

entity
    proposition
        simple proposition
        generic proposition
        rule
    act
    individual

Fig. 4. The initial taxonomy of SNePS semantic classes.



4 Semantics 

Every SNePS 3 node must be given a semantic class recognized by the
system. The initial semantic taxonomy, which may be extended by the knowledge
engineer, is shown in Fig. 4.

Proposition nodes have an assertional status that is recognized by the
inference mechanism (see Sect. 5 below). Rule nodes are proposition nodes that can
be used for node-based inference. An example is shown in Fig. 5. Generic
proposition nodes, such as M89! in Fig. 6, can be used for subsumption-based inference.
Simple proposition nodes are proposition nodes that are neither generic
propositions nor rules. Act nodes may be performed by the SNeRE acting executive
[19,20], [2, Chap. 4].^ Individual nodes are nodes that are neither proposition
nodes nor act nodes.



^ Due to space limitations, act nodes and the acting executive will not be discussed
in this paper.







Stuart C. Shapiro 




Fig. 5. M88! is a rule node denoting the proposition, If Cassie is a robot that talks,
then Cassie is intelligent.



The SNePS 3 node classes and the categories of entities represented by those 
nodes as expressed in the SNePS formalism may be redundant, but this is because 
we, as SNePS 3 designers, did not want to specify a priori how the knowledge 
engineers could choose to represent categories. For example, node M6 of Fig. 1, 
node FEVAHR of Fig. 2, and node B2 of Fig. 3 might each represent the category of 
FEVAHR robots in different conceptualizations. When necessary to distinguish 
these two levels of semantics, I shall use the terms “SNePS semantics” and 
“domain semantics.” 

The intended domain semantics (intended by me, the knowledge engineer of 
this project) of the nodes in Fig. 1 are shown in Tables 1-4.^ There, [[n]] is used
to refer to the denotation of the SNePS term n. Lexemes, agents, times, cate- 
gories, actions, acts, events, and propositions are all assumed to be entities in the 
domain of discourse, which, in this case, is the universe of Cassie’s mental enti- 
ties. Act-denoting terms are nodes in the SNePS act class, proposition-denoting 
terms are nodes in the SNePS proposition class, and all other terms are nodes 
in the SNePS individual class. 



^ Table 2 gives the semantics of B2 as “an interval of time.” Cassie/FEVAHR contains 
a variable, NOW, whose value is the term denoting the current time, and which changes 
as time moves [21]. At the time of the snapshot of Fig. 1, B2 is the value of NOW, 
which is why Cassie’s talking and looking is expressed in the present tense. 
Table 1. The semantics of lexeme-denoting terms from Fig. 1

agent:   The English lexeme “agent”.
Bill:    The English lexeme “Bill”.
Cassie:  The English lexeme “Cassie”.
FEVAHR:  The English lexeme “FEVAHR”.
look:    The English lexeme “look”.
person:  The English lexeme “person”.
robot:   The English lexeme “robot”.
Stu:     The English lexeme “Stu”.
talk:    The English lexeme “talk”.



Table 2. The semantics of agent-, time-, and category-denoting terms from Fig. 1

B1:   Cassie.
B5:   Stu.
B6:   Bill.

B2:   An interval of time.
B3:   An interval of time.
B4:   An interval of time.

M6:   The category of FEVAHRs.
M19:  The category of people.
M22:  The category of robots.
M32:  The category of agents.



Table 3. The semantics of action-, act-, and event-denoting terms from Fig. 1

M1:   The action of talking.
M13:  The action of looking.
M10:  The act of talking to Stu.
M14:  The act of looking at Stu.
M11:  The event of Cassie’s talking to Stu.
M15:  The event of Cassie’s looking at Stu.



Table 4. The semantics of proposition-denoting terms from Fig. 1

M7!:   The proposition that Cassie is a FEVAHR.
M8!:   The proposition that Cassie’s name is “Cassie”.
M9!:   The proposition that [[B2]] is a subinterval of [[B3]] and [[B4]].
M12!:  The proposition that Cassie is talking to Stu throughout [[B3]].
M16!:  The proposition that Cassie is looking at Stu throughout [[B4]].
M17!:  The proposition that Bill’s name is “Bill”.
M18!:  The proposition that Stu’s name is “Stu”.
M20!:  The proposition that Stu and Bill are people.
M31!:  The proposition that FEVAHRs are robots.
M33!:  The proposition that robots and people are agents.
Fig. 6. M89! is a generic proposition node denoting the proposition that Any robot that
talks, is intelligent.



5 Inference Methods 

A SNePS-based agent might believe only some of the propositions represented 
in the network. For example, Cassie might believe that Bill believes that Stu 
is tall, without herself believing it. We notate believed propositions by append- 
ing an exclamation mark (!) to the identifiers of nodes that denote believed 
propositions, and we refer to such nodes as asserted. All the proposition nodes 
of Figures 1-3 are asserted. In Fig. 5, nodes M82, M85, and M87 are unasserted
proposition nodes, and node M88! is an asserted proposition node.

Inference in SNePS is a method of computing newly asserted nodes from 
previously asserted nodes. Thus, inference causes the agent to have new beliefs. 
The newly asserted nodes might previously have been in the network, denoting 
then unbelieved propositions, or they might be created by the inference engine. 

There are four inference methods in SNePS, the first two of which distinguish 
SNePS from logic-based but non-network-based KRR systems. Briefly, and in- 
formally, these are: 
Fig. 7. Node M20!, denoting the proposition that Stu and Bill are people, implies M59!,
denoting the proposition that Stu is a person, by wire-based inference.



Wire-based inference, referred to as “reduction inference” in previous
versions of SNePS (notably in [22]), is an inference method
whereby a proposition node m1 implies a proposition node m2 with a
subset or a superset of m1’s arcs. (See Secs. 6.1 and 6.2.) For example,
node M20! in Fig. 7, repeated from Fig. 1, implies node M59! by
wire-based inference.

Path-based inference [22,23,24] is an inference method whereby a
path of arcs from a proposition node m1 to a node m2 (of any class)
may imply a proposition node m3 with all of m1’s arcs plus an additional
one to m2. Path-based inference must be sanctioned by path-based
inference rules that are given to the SNePS system, but not
represented in the SNePS object language. (See [2, Sect. 2.5.2] for the
syntax and semantics of paths.) One possible path-based inference
rule is

    (define-path class
      (compose class
               (kstar (compose subclass- ! superclass))))

Given this rule, node M60! of Fig. 8, denoting the proposition that
Stu is an agent, may be inferred from M59!, denoting the proposition
that Stu is a person, and the path of arcs from M59! to M32 by path-based
inference followed by wire-based inference so that there is only
one class arc emanating from M60!.

Node-based inference uses nodes that are the analogue of non-atomic
formulas in first-order predicate logic to represent domain rules that
SNePS can use to infer newly asserted proposition nodes from previously
asserted proposition nodes. For example, in Fig. 5, node M88!
is an “and-entailment”, a kind of entailment with M82 and M85 as
conjoined antecedents and M87 as a consequent. M88! denotes the
domain rule, If Cassie is a robot that talks, then Cassie is intelligent.
Since M82 follows from Fig. 1’s M7! by path-based inference,
and M85 follows from Fig. 1’s M12! by wire-based inference, M87 may
be inferred by node-based inference.

Fig. 8. Node M60!, denoting the proposition that Stu is an agent, may be inferred from
M59!, denoting the proposition that Stu is a person, and the path of arcs from M59! to
M32 by path-based inference followed by wire-based inference.

Subsumption inference is an inference method in SNePS 3 that is
not available in earlier versions of SNePS, and uses the ANALOG
[3,4] technique of representing variables. For example, Fig. 6 shows
a SNePS 3 representation of the generic proposition, Any robot that
talks, is intelligent. V1 is a structured variable that denotes the arbitrary
talking robot [25], R1 and R3 are restrictions on V1, and M89!
denotes the asserted generic proposition. Since R1 subsumes Fig. 5’s
M82 and R3 subsumes Fig. 5’s M85, and those nodes follow from Fig. 1
as explained above, V1 subsumes Fig. 1’s (and Fig. 5’s) B1,
M89! subsumes Fig. 5’s M87, and M87 follows from M89! by subsumption
inference.
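As an informal illustration of path-based inference, the following Python sketch follows the path class, then zero or more (subclass-, !, superclass) steps over a hand-built fragment of the Fig. 1 network. It is not SNePS code. The dictionary encoding, the function names, and the reading of "!" as a test that the intermediate node is asserted are assumptions of this sketch, as are the arc labels member, class, subclass and superclass.

```python
# Hypothetical miniature of part of Fig. 1: node -> {arc label: set of targets}.
NET = {
    "M20!": {"member": {"B5", "B6"}, "class": {"M19"}},            # Stu and Bill are people
    "M7!":  {"member": {"B1"}, "class": {"M6"}},                   # Cassie is a FEVAHR
    "M31!": {"subclass": {"M6"}, "superclass": {"M22"}},           # FEVAHRs are robots
    "M33!": {"subclass": {"M19", "M22"}, "superclass": {"M32"}},   # people, robots are agents
}


def is_asserted(node: str) -> bool:
    """Asserted nodes carry a trailing exclamation mark, as in the figures."""
    return node.endswith("!")


def superclass_step(category: str) -> set:
    """One pass of (compose subclass- ! superclass) starting at a category."""
    reached = set()
    for node, arcs in NET.items():
        # subclass- : go backwards along a subclass arc to its source node;
        # !         : require that source node to be asserted;
        # superclass: go forwards along its superclass arcs.
        if category in arcs.get("subclass", ()) and is_asserted(node):
            reached |= arcs.get("superclass", set())
    return reached


def derived_classes(prop: str) -> set:
    """Categories reachable from prop via class, then kstar of superclass_step."""
    frontier = set(NET[prop].get("class", ()))
    reached = set()
    while frontier:
        category = frontier.pop()
        if category not in reached:
            reached.add(category)
            frontier |= superclass_step(category)
    return reached
```

Here `derived_classes("M7!")` reaches M22 (robots), which corresponds to the step by which M82 follows from M7!, and `derived_classes("M20!")` reaches M32 (agents), the category used in the Stu example.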

6 Syntax 

The syntax of the SNePS 3 language is defined in terms of nodes, relations, wires, 
cables, and cablesets (see also [3,4,22]). Informally, a node is what I have been 
calling a node heretofore, and a relation is an arc label. A wire is a pair, (r, n), 
of a relation r and a node n. A cable is a pair, (r, ns), of a relation r and a set of
nodes (or “nodeset”) ns. A cableset is a set of cables, {(r1, ns1), . . . , (rk, nsk)},
such that no two are the same. A set of wires all of which have the same
relation forms a cable. A set of wires with different relations forms a cableset.
Nodes that I have been informally presenting as labeled circles or ovals with arcs
emanating from them formally are cablesets.
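The relationship between wires, cables, and cablesets can be made concrete with a short Python sketch. The function name `wires_to_cableset` and the use of frozensets are choices of this illustration, not part of SNePS 3.

```python
from collections import defaultdict


def wires_to_cableset(wires):
    """Group (relation, node) wires into a cableset.

    Wires sharing a relation are merged into one cable (relation, nodeset),
    and the cableset is the set of those cables.
    """
    grouped = defaultdict(set)
    for relation, node in wires:
        grouped[relation].add(node)
    return frozenset((rel, frozenset(nodes)) for rel, nodes in grouped.items())


# Node M20! of Fig. 1 (Stu and Bill are people), written as three wires.
m20 = wires_to_cableset([("member", "B5"), ("member", "B6"), ("class", "M19")])

# The same wires in another order give a structurally equal cableset.
m20_again = wires_to_cableset([("class", "M19"), ("member", "B6"), ("member", "B5")])
```

Because the result is an immutable set of immutable pairs, two nodes with the same arcs compare equal and hash identically, which is one simple way to support the structure sharing that the Uniqueness Principle asks of an implementation.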

With the previous paragraph as introductory, and the foregoing sections as 
motivational, I can now give a more formal presentation of SNePS 3 syntax. 



6.1 Relations 

A SNePS 3 relation is a four-tuple, (name, type, adjust, limit), where the four
constituents are:

name: A symbolic name of the relation. No two relations may have the same 
name. Relation names were used in Figures 1-8 as labels on directed arcs. 
type: The SNePS semantic class of the nodes in the range of this relation, i.e., 
of the nodes pointed to by arcs labeled with this relation. 
adjust: Either expand, reduce, or none. This specifies how wire-based inference 
treats instances of this relation. For example, Fig. 7 illustrates the
reducibility of the relation named member. However, the relation named &ant, shown
in Fig. 5, is expandable. A value of none means that this relation is not
subject to wire-based inference. 

limit: The minimal size of a nodeset that can be paired with this relation in a 
cable. If adjust is reduce, then limit is the minimal allowed reduction. For 
example, to prevent Fig. 7’s node M20 ! from implying a node with only a 
class arc and no member arcs, the limit of the relation named member is 1. 
If adjust is expand, then limit is the minimal nodeset-size of a cable that 
allows more wires to be added. For example, it does not make sense to add 
&ant arcs to a node that is not a kind of entailment. 

Assuming that the knowledge engineer has declared the classes category, 
event, and time to be subclasses of the SNePS class of individual nodes, some 
example relations would be: 

(member, entity, reduce, 1)        (class, category, reduce, 1)
(event, event, reduce, 1)          (time, time, reduce, 0)
(&ant, proposition, expand, 1)     (cq, proposition, reduce, 1)
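The four-tuple can be pictured as a small record type. The following is a minimal Python sketch, not the SNePS 3 implementation: the class `Relation`, the helper `declare`, and the registry dictionary are hypothetical names introduced here to show the uniqueness of relation names and the three permitted adjust values.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Relation:
    name: str    # unique symbolic name, used as an arc label
    type: str    # semantic class of the nodes the arcs point to
    adjust: str  # "expand", "reduce", or "none"
    limit: int   # minimal nodeset size, as described under limit

    def __post_init__(self):
        if self.adjust not in ("expand", "reduce", "none"):
            raise ValueError(f"unknown adjust value: {self.adjust!r}")
        if self.limit < 0:
            raise ValueError("limit must be non-negative")

def declare(registry, relation):
    """Add a relation to the registry; no two relations may share a name."""
    if relation.name in registry:
        raise ValueError(f"relation {relation.name!r} already declared")
    registry[relation.name] = relation
    return relation

# The six example relations listed above.
relations = {}
for r in [
    Relation("member", "entity", "reduce", 1),
    Relation("class", "category", "reduce", 1),
    Relation("event", "event", "reduce", 1),
    Relation("time", "time", "reduce", 0),
    Relation("&ant", "proposition", "expand", 1),
    Relation("cq", "proposition", "reduce", 1),
]:
    declare(relations, r)
```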



6.2 Case Frames

A case frame is a pair, (c, p), where c is a SNePS semantic class and p is a set of
relations. Informally, c is the class of all nodes that have the given set of arcs 
emanating from them. A more formal explanation is given below. 

[Figure: tree of syntactic classes. node > base, molecular; molecular > variable, restriction, closed; variable > universal, existential; closed > atomic, non-atomic; non-atomic > and-entailment.]

Fig. 9. Part of the hierarchy of the syntactic classes of SNePS nodes.

Case frames are partially ordered by the relation adjustable ($\leq_{adj}$). Case
frame $(c_1, p_1)$ is adjustable to case frame $(c_2, p_2)$ ($(c_1, p_1) \leq_{adj} (c_2, p_2)$) iff

1. $c_1 = c_2$;

2. every relation in $p_1 - p_2$ is reducible and has limit = 0;

3. every relation in $p_2 - p_1$ is expandable and has limit = 0.

For example, the fact that Fig. 5’s M85 follows from Fig. 1’s M12! by wire-based
inference is partially justified by the fact that

(proposition, {(event, event, reduce, 1), (time, time, reduce, 0)})

$\leq_{adj}$

(proposition, {(event, event, reduce, 1)})
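The three conditions of the adjustability relation translate directly into set operations. The sketch below is an illustration under stated assumptions: `Relation` and `adjustable` are hypothetical names, and a case frame is encoded as a pair of a class name and a frozenset of relations.

```python
from collections import namedtuple

# Hypothetical record for the four-tuple (name, type, adjust, limit).
Relation = namedtuple("Relation", "name type adjust limit")

def adjustable(frame1, frame2):
    """Return True if case frame (c1, p1) is adjustable to (c2, p2)."""
    c1, p1 = frame1
    c2, p2 = frame2
    if c1 != c2:                                   # condition 1
        return False
    dropped = p1 - p2                              # relations removed by reduction
    added = p2 - p1                                # relations added by expansion
    return (
        all(r.adjust == "reduce" and r.limit == 0 for r in dropped)    # condition 2
        and all(r.adjust == "expand" and r.limit == 0 for r in added)  # condition 3
    )

event = Relation("event", "event", "reduce", 1)
time_rel = Relation("time", "time", "reduce", 0)
with_time = ("proposition", frozenset({event, time_rel}))
without_time = ("proposition", frozenset({event}))

print(adjustable(with_time, without_time))   # True
print(adjustable(without_time, with_time))   # False
```

The second call fails because moving in the other direction would add the time relation, which is declared reducible rather than expandable.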

6.3 Nodes 

Before entering nodes into a SNePS 3 network, the knowledge engineer must 
declare the case frames to be used. The system will check that relations are being 
used consistently. The knowledge engineer may also add additional semantic 
classes to the SNePS 3 class hierarchy, and use these new classes in the case 
frames. 

A valid SNePS 3 node must be in one of the syntactic classes in the SNePS 3 
syntactic hierarchy, part of which is shown in Fig. 9. 

A base node consists only of an identifier. A base node identifier may be 
created by a SNePS 3 user. Some examples of these in Fig. 1 are Cassie, robot, 
and look. Other base nodes may be created and named by the system. Examples 
of these in Fig. 1 are B1 and B2. When a base node is created, it must be given 
a SNePS 3 semantic class. 

An atomic closed molecular node is a cableset $\{(r_1, ns_1), \ldots, (r_k, ns_k)\}$ such
that:

1. the set $\{r_1, \ldots, r_k\}$ is the relation set of some declared case frame (c, p);

2. for each i, each node in $ns_i$ is of the semantic class which is the type of $r_i$;

3. for each i, the cardinality of $ns_i$ is at least the limit of $r_i$.

If a cableset satisfies these three conditions, it is considered to be an instance of
the case frame (c, p), and is entered into the network as a node in the semantic
class c. Closed molecular nodes are given identifiers of the form Mn by the system.

We can refer to a cableset either by using the cableset notation, using just
its identifier, or using both. For example, node M20! of Figures 1 and 7 may be
referred to as M20! : {(member, {B5, B6}), (class, {M19 : {(lex, {person})}})}. 
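The three conditions for an atomic closed molecular node can be checked mechanically. The following Python sketch is only an illustration: the function name, the dictionary encodings, the semantic classes assigned to B5, B6, and M19, and the class proposition for the member/class case frame are assumptions made for the example. For simplicity, it compares semantic classes by equality and ignores subclasses.

```python
from collections import namedtuple

# Hypothetical record for the four-tuple (name, type, adjust, limit).
Relation = namedtuple("Relation", "name type adjust limit")

def instantiated_class(cableset, case_frames, relations, node_class):
    """Return the class c of a declared case frame (c, p) that the cableset
    instantiates, or None if the cableset is not a valid node."""
    names = frozenset(cableset)
    for c, p in case_frames:
        if names != p:                                           # condition 1
            continue
        if all(
            all(node_class[n] == relations[r].type for n in ns)  # condition 2
            and len(ns) >= relations[r].limit                    # condition 3
            for r, ns in cableset.items()
        ):
            return c
    return None

relations = {
    "member": Relation("member", "entity", "reduce", 1),
    "class": Relation("class", "category", "reduce", 1),
}
case_frames = [("proposition", frozenset({"member", "class"}))]
node_class = {"B5": "entity", "B6": "entity", "M19": "category"}  # assumed classes

m20 = {"member": frozenset({"B5", "B6"}), "class": frozenset({"M19"})}
print(instantiated_class(m20, case_frames, relations, node_class))  # proposition
```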

SNePS 3 knowledge engineers may create any case frames they desire to 
form atomic closed molecular nodes. The syntax of non-atomic, variable, and 
restriction nodes, however, is fixed.

Non-atomic molecular nodes are used to represent rules used for node-based 
inference. For example, node M88! of Figure 5 is an and-entailment, the case 
frame for which is 

(rule, {(&ant, proposition, expand, 1), (cq, proposition, reduce, 1)}) 

See [2, Chap. 3] for the syntax of the other non-atomic nodes. 

A universal variable node is an instance of one of the case-frame schemata 

(c, {(any, restriction, expand, 1)})

and an existential variable node is an instance of one of the case frame schemata 

(c, {(some, restriction, none, 1), (depends, universal, none, 0)}) 

where c is any semantic class, restriction is any restriction node, and universal 
is any universal variable node. Variable nodes are given identifiers of the form 
Vn by the system. 

A restriction node is an instance of any case frame declared by the knowledge 
engineer, except that every restriction node must dominate at least one variable 
node. Thus, variable nodes and restriction nodes form cycles in the SNePS 3 
network. Two such cycles are illustrated in Fig. 6: V1-R1-V1 and V1-R3-R2-V1.
Restriction nodes are given identifiers of the form Rn by the system. 

One more relation is needed for the SNePS 3 syntax of generic propositions: 
(close, universal, none, 0). Nodes from which close arcs emanate are taken to
form the scope of universal variable nodes that the close arcs point to, as well 
as all existential variable nodes dependent on them. An example of the need for 
this is to distinguish Any talking robot is not intelligent, from It is not the case 
that any talking robot is intelligent. A close arc is shown in Fig. 6. 

7 Summary 

SNePS 3, the latest version of the SNePS family of propositional semantic net-
works, is a logic- and network-based knowledge representation, reasoning, and
acting system, based on a paraconsistent, first-order term logic, with composi- 
tional intensional semantics. 

SNePS 3 differs from earlier versions of SNePS by having the following fea- 
tures: 

1. formal SNePS semantic classes of nodes 

2. formal definition of relations

3. formal definition of case frames

4. structured variables 

5. wire-based inference, including expansion as well as reduction, and limits on 
adjustment 

6. subsumption inference 

8 Availability and Use 

SNePS 3 is currently being implemented. SNePS 2.5 has been implemented in 
ANSI Common Lisp and runs on any platform where that language is installed. 
SNePS 2.5 is useable for research and experimentation, and can currently handle 
knowledge bases on the order of about 1,000 SNePS nodes. It and other versions 
of SNePS are currently in use at various sites around the world, including the 
U.S., Portugal, Italy, and Japan. 

The SNePS 2.5 source code and manual may be freely downloaded from the 
SNePS Research Group web pages at http://www.cse.buffalo.edu/sneps/, 
along with a tutorial, sample demonstration runs, and a bibliography. 

9 Benchmarks and Comparisons 

SNePS comes with a suite of demonstration problems and applications that can 
be used to familiarize oneself with how to use it, and may be used for comparison 
with other systems. Demonstrations that were taken from other sources include 
The Jobs Puzzle from [26, Chapter 3.2], Schubert’s steamroller problem (see 
[27]), and a database management system example from [28]. 

Schubert’s steamroller problem was run on a 1993 version of SNePS [29], and 
the results compared with those reported in [27]. The SNePS version produced 
fewer unifications and was faster than most unsorted logic solutions, but was 
outperformed by sorted logic solutions. The current version of SNePS is much 
faster on this problem than the 1993 version, partially due to improvements in 
SNePS, and partially due to faster, bigger computers. 

The SNePS representation of the Jobs Puzzle is much simpler and closer to 
the English version of the puzzle than the clause form representation presented 
in [26, p. 58ff].

Abstract. Network information system specification in virtual professional
communities requires a legitimate user-driven approach. In such
an approach, only specification changes are produced that are not only 
meaningful but also acceptable to all users. To do so, for each requested 
change, a relevant user group needs to be selected to work out the re- 
quired knowledge definition changes. This paper describes the mecha- 
nism through which such a relevant user group can be calculated. The 
dynamics of the composition norms that guide the required specification 
behaviour are explained. The conceptual graph notation for four cate- 
gories of specification knowledge is given. The Peirce conceptual graph 
workbench is used to demonstrate the composition norm dynamics cal- 
culation. 



1 Introduction 

Collaborative work is increasingly being done in a distributed fashion, supported 
by commonly available Internet-based information tools such as mailing lists or 
the web. We define the virtual professional communities in which such collab- 
oration is to take place as communities or networks of professionals whose col- 
laboration on activities required to realize shared goals is mostly or completely 
computer-enabled. The workflows of these communities are often supported by 
network information systems consisting of linked and configured standard infor- 
mation tools. The communal requirements and systems typically evolve strongly, 
with the users having an important role both as sources and as modellers of 
the system specifications. Active user participation in the specification process 
of such continuously evolving network information systems is very important, 
since community members have the most detailed knowledge about when break- 
downs in work arise and how they can be resolved. One significant weakness of 
the traditional methods supporting specification processes is that they do not 
sufficiently involve the users (see [3] for a detailed study). They tend to rely on 
external analysts controlling the specification process, leaving the users only the 
rather passive role of being interviewed by them. Other methods, in particular 
socio-technical specification methods (such as Soft Systems Methodology), often 
overinvolve users in the sense of letting them participate in every conceivable 
change process. To increase the efficiency and willingness of users to participate 
in change processes, it must therefore be exactly known for each specification
process what subset of all users is to take part, and in which capacity these are
to be involved. To adequately determine the relevant user group, a legitimate 
user-driven specification approach is required. First, in such an approach, the 
community members are not just to provide specification knowledge, but also to 
control the process in which this knowledge is produced, thus making the spec-
ification process truly user-driven. Second, the members of virtual professional
communities, like their counterparts in traditional communities, are guided in 
their work by shared social norms. These norms should govern both the oper- 
ations of a network and the specification processes in which the network and 
its information system are being defined. As these networks are egalitarian in na-
ture, such norms cannot be imposed from above, but should originate from the
community as a whole. Thus, the user-driven specification process needs to be 
legitimate as well, in the sense that specification changes are not only mean- 
ingful, but also acceptable to all members of the community. We call the norms 
that regulate the acceptable specification behaviour of the members of a virtual 
professional community composition norms (while we refer to the norms that 
regulate operational workflow behaviour as action norms). 

The RENISYS (REsearch Network Information SYstem Specification) 
method [3] supports such a legitimate user-driven approach. It allows users facing 
a breakdown in their work to identify problematic knowledge definitions which 
they feel should be changed. For each of these definitions, RENISYS calculates 
the relevant user group, which it provides with the appropriate related knowl- 
edge definitions and the discussion environment needed for the group to work 
out the acceptable definition changes. 

In [2], we explained how ontological and normative knowledge can be repre- 
sented in conceptual graphs, and how these knowledge categories can be used to 
produce legitimate knowledge definition changes. In [4], we explored how the con-
text lattices proposed in [8] can be applied to efficiently structure, query, and
update composition norms. Based on this work, we now show how the RENISYS 
method uses conceptual graph theory to determine the exact relevant user group 
required for a particular required specification change. To do so, the composition 
norm dynamics need to be calculated. In this way, a set of applicable norms can 
be calculated for each user and composition (part of the specification process 
necessary to resolve the breakdown). By then calculating what is the resultant 
effect of such a norm set, the method can determine whether a particular user 
is permitted, required, or prohibited to take part in certain stages of the speci- 
fication process. This calculation, however, falls outside the scope of the current 
paper (see [3] for a detailed description).

Sect. 2 describes the semantics of composition norm dynamics. The concep-
tual graph notation of the various categories of knowledge definitions that are
the output of specification processes is explained in Sect. 3. In Sect. 4, it is
shown how the norm dynamics can be calculated using a standard conceptual
graph workbench. The final section contains some discussion and conclusions.

2 Composition Norm Dynamics

The structure of composition norms has been extensively discussed elsewhere 
[3]. We therefore restrict ourselves here to briefly outlining their main elements:

• deontic effect: the intended effect of a norm on the person who is to make 
a composition. A composition is either permitted, required, or forbidden. 

• actor: the role, for example that of editor or reviewer, that a person is to 
play in order to be affected by the norm. 

• control process. This concerns either an initiation, execution, or evaluation 
of the specification process at hand. 

• specification process. In a specification process, a knowledge definition is 
changed. These change processes are either creations, modifications, or termina- 
tions of such definitions. The knowledge definitions themselves can be of four 
different categories: type definitions, state definitions, action norms, and com- 
position norms. The role of the norms was already described in the previous 
section, while type definitions describe ontological knowledge, and state defini- 
tions represent states-of-affairs. For each knowledge definition category, a sepa- 
rate specification process has been defined. Thus, there are twelve customized 
specification processes, such as ‘Create_Type’ or ‘Terminate_State’. The char- 
acteristics of these knowledge definitions and their specification processes have 
been discussed in detail in [4] and [3]. Their conceptual graph notation is pre- 
sented in Sect. 3. 

The formal notation for a composition norm $d_{cn}$ is the following:
$d_{cn} = (id, de, a, cp, sp)$, where id is the identifier of the norm, de is the deontic
effect, a the actor, cp the control process and sp the specification process. An
example of such a norm could be:

$d_{cn}$ = (#12, Perm, List_Owner, Exec, Modify_Type(Mailing_List))¹.

This norm says
that a list owner is permitted to carry out changes in the (functionality) defini-
tions of a mailing list, for example, by declaring the list to have an open instead
of closed subscription procedure.

Example Assume that the set of legitimate composition norms $\mathcal{D}_{CN}$ consists
of the following definitions:

- (#58, Perm, Publ_Coord, Init, Terminate_State(Reviewer))
- (#59, Req, Editor, Exec, Modify_Type(Review))
- (#60, Perm, Actor, Control, Specify(Definition))
- (#61, Req, Editor, Control, Create_Type(Edit))
- (#62, Forb, Journal_Editor, Eval, Create_Type(Edit))
- (#63, Req, Reviewer, Init, Create_Type(Edit))
- (#64, Perm, Reviewer, Control, Create_Type(Edit_Report))
- (#65, Forb, Reviewer, Eval, Create_Type(Edit_Report))

¹ Note that, for simplicity, in the examples we represent the actor and control process
entities by their type labels, instead of giving the full entity definition that would
also include an identifier and a referent. Furthermore, the (nested) structure of the
specification process is not yet formally defined; we do this in Sect. 3. The
predicate stands for the type of specification process, its argument for the knowledge
definition being changed.

Composition norm #58 indicates that a publication coordinator may start 
the removal of a particular reviewer of a journal. Norm #59 expresses that an 
editor must revise the review process, if prompted. Norm #60 is a very generic 
norm, saying that any actor may control any specification process. Such a generic 
norm is typically defined at the initiation of a network, when the information 
system is still small in scope and only few users and actor roles have been 
defined. Norm #61 says that an editor must control the creation of new types of 
edit processes. However, according to norm #62 a journal editor is not allowed 
to evaluate such newly created process types. This norm could be introduced 
to ensure that such an editor cannot manipulate the results of his own work 
processes. Norm #63 says that a reviewer is responsible for starting the creation 
of a new edit process, for example when he is no longer satisfied with the way 
reviews are being handled. Norm #64 permits a reviewer to fully control the 
creation of report edit process types. Finally, norm #65 says that a reviewer is 
not allowed to evaluate a newly created report edit process definition. Such a 
privilege could instead be granted, for instance, only to the editorial board. 

The types of the various norm elements are ordered using the following type
hierarchy²:

T >
    Definition
    Entity >
        PD_Actor >
            Editor >
                Journal_Editor
            Publ_Coord
            Reviewer
        Control >
            Init
            Exec
            Eval
        Activity >
            Edit >
                Edit_Report
            Review
        Specify >
            Create_Type
            Modify_Type
            Terminate_State

² This hierarchy is formed by the relevant parts of the ontological framework intro-
duced in [3], combined with some new, example-based types (in italics). Note that
not all intermediate types are presented here to conserve space.

Composition norms play different roles depending on the users and specifica-
tion processes they apply to at a particular moment in time. We refer to the way
in which the status of norms can change as norm dynamics. These dynamics for
composition norms can be summarized as follows: at any time, a composition
norm base contains the set of legitimate norms $\mathcal{D}_{CN}$. A legitimate composition
norm is invoked when there is at least one user with whom the norm matches.
Invoked norms become active if they match with the active specification process,
which is the process in which a problematic definition is currently to be changed.
Each specification process consists of three parts: its initiation, execution, and
evaluation. These parts are called the specification process compositions, which
in the case of the active specification process we also refer to as active compo-
sitions. For each combination of user and active composition, a set of applicable
norms exists, which determines what is the acceptable specification behaviour
for that user and composition.

These norm dynamics need to be known, because they restrict the sets on 
which norm calculations need to be carried out. This is especially important in 
case of large numbers of norm definitions. In order to model the norm dynamics, 
two matching processes need to be defined. A user match is defined as a match 
between a user and an actor component of some composition norm. This means 
that at least one of the actor roles that the user plays is a subtype of the norm 
actor component. A composition match is defined as a match between a specifi- 
cation process composition and the composition part of some composition norm, 
also called the norm composition part. Such a match implies that the specifi- 
cation process composition must be a specialization of the norm composition 
part, as we support the view that generic norms are stronger than more specific 
norms. Thus, a specification process composition matches with, i.e. is governed 
by some composition norm, if the norm composition part is more generic than 
the specification process composition. In Sect. 4 we show how to calculate this 
match using conceptual graphs. 

Definition 1 

Let there be a composition norm $d_{cn} = (id, de, a, cp, sp) \in \mathcal{D}_{CN}$. comp is
a function on the argument $d_{cn}$ that produces the norm composition part:
$comp(d_{cn}) = (cp, sp)$. Furthermore, $\mathcal{U}$ is the set of users, $\mathcal{E}$ is the set of entities
($\mathcal{U} \subset \mathcal{E}$). Function type(e) returns the type, and ref(e) returns the referent of
entity e.

Let there be a user $u \in \mathcal{U}$ and an actor a of some composition norm $d_{cn} =
(id, de, a, cp, sp) \in \mathcal{D}_{CN}$. There is a user match between u and a, denoted as
$u \,\theta_u\, a$, if $\exists e \in \mathcal{E} \mid ref(e) = ref(u) \wedge type(e) \leq type(a)$.

Let there be some specification process composition $comp \in Comp$ (the
set of all possible compositions, which is the Cartesian product of all con-
trol processes and all specification processes), and a norm composition part
$comp(d_{cn})$. There is a composition match between comp and $comp(d_{cn})$, de-
noted as $comp \,\theta_n\, comp(d_{cn})$, if comp is a specialization of $comp(d_{cn})$. There
is such a specialization if both the control process part and the specification
process of comp are specializations of their counterparts in $comp(d_{cn})$. A spec-
ification process is a specialization of another specification process if both the
type and the embedded knowledge definition of the first are specializations of
those of the second.

Example The set of legitimate norms $\mathcal{D}_{CN}$ = {#58, . . . , #65}. The active spec-
ification process is the creation of a report edit process type, informally labeled
Create_Type(Edit_Report). We further assume that the network contains two
users: John, a journal editor, and Jack, a reviewer.

For composition norm #60, for instance, the norm composition part
comp(#60) is (Control, Specify(Definition)). Given are a user John (more pre-
cisely: $u_1 \in \mathcal{U}$ with type($u_1$) = User and ref($u_1$) = John) and an entity $e_1$
(with type($e_1$) = Journal_Editor and ref($e_1$) = John). Say there is an active
composition

$comp_1$ = (Init, Create_Type(Edit_Report)):

$u_1 \,\theta_u\, a$ holds, because ref($e_1$) = ref($u_1$) $\wedge$ type($e_1$) $\leq$ Actor (the norm
actor).

$comp_1 \,\theta_n\,$ comp(#60) holds, because the various parts of $comp_1$ are special-
izations of their counterparts in comp(#60).
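The two matches of Definition 1 can be sketched as subtype tests over the type hierarchy. The Python code below is an illustration under stated assumptions, not the RENISYS implementation: the function names and the tuple encoding are hypothetical, Actor is assumed to sit directly above PD_Actor (the source elides intermediate types), and the generic label Definition is assumed to subsume every embedded knowledge definition.

```python
# Child -> parent links of the example type hierarchy. Placing Actor above
# PD_Actor is an assumption; the source omits some intermediate types.
PARENT = {
    "Definition": "T", "Entity": "T",
    "Actor": "Entity", "PD_Actor": "Actor",
    "Editor": "PD_Actor", "Journal_Editor": "Editor",
    "Publ_Coord": "PD_Actor", "Reviewer": "PD_Actor",
    "Control": "Entity", "Init": "Control", "Exec": "Control", "Eval": "Control",
    "Activity": "Entity", "Edit": "Activity", "Edit_Report": "Edit",
    "Review": "Activity",
    "Specify": "Entity", "Create_Type": "Specify",
    "Modify_Type": "Specify", "Terminate_State": "Specify",
}

def is_subtype(t, u):
    """True if t equals u or lies below u in the hierarchy."""
    while t != u and t in PARENT:
        t = PARENT[t]
    return t == u

def user_match(roles, actor):
    """User match: some role the user plays is a subtype of the norm actor."""
    return any(is_subtype(role, actor) for role in roles)

def composition_match(comp, norm_comp):
    """Composition match: comp specializes the norm composition part."""
    cp, (sp_type, definition) = comp
    norm_cp, (norm_sp_type, norm_definition) = norm_comp
    # Assumption: the generic label Definition subsumes any knowledge definition.
    definition_ok = (norm_definition == "Definition"
                     or is_subtype(definition, norm_definition))
    return (is_subtype(cp, norm_cp)
            and is_subtype(sp_type, norm_sp_type)
            and definition_ok)

john_roles = ["Journal_Editor"]
comp_1 = ("Init", ("Create_Type", "Edit_Report"))
comp_60 = ("Control", ("Specify", "Definition"))
comp_59 = ("Exec", ("Modify_Type", "Review"))

print(user_match(john_roles, "Actor"))        # True
print(composition_match(comp_1, comp_60))     # True
print(composition_match(comp_1, comp_59))     # False
```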

We say that a (legitimate) composition norm becomes an invoked composi- 
tion norm if there is a user match between some user and the norm actor. 

Definition 2 

$\mathcal{D}_{CN\_I}$ is the set of invoked composition norms.

The function $D_{CN\_I} : \mathcal{P}(\mathcal{U}) \rightarrow \mathcal{P}(\mathcal{D}_{CN})$ determines which legitimate compo-
sition norms are in $\mathcal{D}_{CN\_I}$ (where $\mathcal{P}$ denotes the powerset):

$D_{CN\_I}(\mathcal{U}) = \{ d_{cn} = (id, de, a, cp, sp) \in \mathcal{D}_{CN} \mid \exists u \in \mathcal{U} : u \,\theta_u\, a \}$

Example The set of invoked norms $\mathcal{D}_{CN\_I}$ = {#59, . . . , #65}. Legitimate com-
position norm #58 is not invoked, because there are no users playing roles that
are subtypes of the Publ_Coord actor role. All the other norms are invoked, be-
cause there is at least one user playing some role that is a subtype of the actor
role of each of these norms.

Whereas the invocation of legitimate norms depends on which users are par- 
ticipating in the community, the actual activation of the invoked composition 
norms depends on the currently active specification process. An invoked composi- 
tion norm is an active composition norm if at least one of the active compositions 
(i.e. the initiation, execution, or evaluation of the active specification process) 
matches with the composition part of the invoked norm. 

Definition 3 

$\mathcal{D}_{CN\_A}$ is the set of active composition norms.

$sp_a$ is the active specification process. The set of active compositions $Comp_A$
= {(Init, $sp_a$), (Exec, $sp_a$), (Eval, $sp_a$)}.

The function $D_{CN\_A} : \mathcal{SP} \rightarrow \mathcal{P}(\mathcal{D}_{CN\_I})$ determines which invoked norms
are in $\mathcal{D}_{CN\_A}$:

$D_{CN\_A}(sp_a) = \{ d_{cn\_i} \in \mathcal{D}_{CN\_I} \mid \exists\, comp_a \in Comp_A : comp_a \,\theta_n\, comp(d_{cn\_i}) \}$

Example The set of active norms $\mathcal{D}_{CN\_A}$ = {#60, . . . , #65}. For instance, none
of the active compositions of the active specification process
Create_Type(Edit_Report) is a specialization of the norm composition part of
invoked norm #59, which is therefore not an active composition norm. For all
other invoked norms, there is at least one active composition of $sp_a$ which is a
specialization of the norm composition part.

Active norms do not have an effect on all users. We call an active composition 
norm applicable to a particular user for a particular active composition if (1) the 
user matches with the actor part of the norm and (2) the active composition 
matches with the norm composition part. 

Definition 4 

$\mathcal{D}_{CN\_APPL}(u, comp_a)$ is the set of applicable composition norms for user u and
active composition $comp_a$. The function $D_{CN\_APPL} : \mathcal{U} \times Comp_A \rightarrow \mathcal{P}(\mathcal{D}_{CN\_A})$
determines which active norms are in $\mathcal{D}_{CN\_APPL}(u, comp_a)$ for u and $comp_a$:

$D_{CN\_APPL}(u, comp_a) = \{ d_{cn\_a} = (id, de, a, cp, sp) \in \mathcal{D}_{CN\_A} \mid u \,\theta_u\, a \wedge comp_a \,\theta_n\, comp(d_{cn\_a}) \}$

Example The applicable norm sets are:

$\mathcal{D}_{CN\_APPL}$(John, Init_Create_Type(Edit_Report)) = {#60, #61, #62}
$\mathcal{D}_{CN\_APPL}$(John, Exec_Create_Type(Edit_Report)) = {#60, #61}
$\mathcal{D}_{CN\_APPL}$(John, Eval_Create_Type(Edit_Report)) = {#60, #61, #62}
$\mathcal{D}_{CN\_APPL}$(Jack, Init_Create_Type(Edit_Report)) = {#60, #63, #64}
$\mathcal{D}_{CN\_APPL}$(Jack, Exec_Create_Type(Edit_Report)) = {#60, #64}
$\mathcal{D}_{CN\_APPL}$(Jack, Eval_Create_Type(Edit_Report)) = {#60, #64, #65}

For example, $\mathcal{D}_{CN\_APPL}$(John, Init_Create_Type(Edit_Report)) contains active
composition norm #60, because (1) there is a user match between John and
the actor component of this norm (the journal editor role that John plays is a
subtype of Actor) and (2) there is a composition match between active compo-
sition (Init, Create_Type(Edit_Report)) and norm composition part
(Control, Specify(Definition)).
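Definitions 2 to 4 amount to three successive filters over the norm base. The following Python sketch shows one way to chain them; it is not the RENISYS implementation. All function names and the tuple encoding are hypothetical, Actor is assumed to sit above PD_Actor, and the generic label Definition is assumed to subsume every embedded knowledge definition.

```python
# Child -> parent links; Actor above PD_Actor is an assumption.
PARENT = {
    "Definition": "T", "Entity": "T", "Actor": "Entity", "PD_Actor": "Actor",
    "Editor": "PD_Actor", "Journal_Editor": "Editor",
    "Publ_Coord": "PD_Actor", "Reviewer": "PD_Actor",
    "Control": "Entity", "Init": "Control", "Exec": "Control", "Eval": "Control",
    "Activity": "Entity", "Edit": "Activity", "Edit_Report": "Edit",
    "Review": "Activity", "Specify": "Entity", "Create_Type": "Specify",
    "Modify_Type": "Specify", "Terminate_State": "Specify",
}

def is_subtype(t, u):
    """True if t equals u or lies below u in the hierarchy."""
    while t != u and t in PARENT:
        t = PARENT[t]
    return t == u

# Norm id -> (deontic effect, actor, control process, (process type, definition)).
NORMS = {
    58: ("Perm", "Publ_Coord", "Init", ("Terminate_State", "Reviewer")),
    59: ("Req", "Editor", "Exec", ("Modify_Type", "Review")),
    60: ("Perm", "Actor", "Control", ("Specify", "Definition")),
    61: ("Req", "Editor", "Control", ("Create_Type", "Edit")),
    62: ("Forb", "Journal_Editor", "Eval", ("Create_Type", "Edit")),
    63: ("Req", "Reviewer", "Init", ("Create_Type", "Edit")),
    64: ("Perm", "Reviewer", "Control", ("Create_Type", "Edit_Report")),
    65: ("Forb", "Reviewer", "Eval", ("Create_Type", "Edit_Report")),
}
ROLES = {"John": ["Journal_Editor"], "Jack": ["Reviewer"]}
ACTIVE_SP = ("Create_Type", "Edit_Report")
ACTIVE_COMPS = [(cp, ACTIVE_SP) for cp in ("Init", "Exec", "Eval")]

def user_match(user, norm):
    """User match between a user and the actor part of a norm."""
    return any(is_subtype(role, norm[1]) for role in ROLES[user])

def comp_match(comp, norm):
    """Composition match between a composition and the norm composition part."""
    cp, (sp_type, definition) = comp
    _, _, norm_cp, (norm_sp_type, norm_definition) = norm
    # Assumption: the generic label Definition subsumes any knowledge definition.
    definition_ok = (norm_definition == "Definition"
                     or is_subtype(definition, norm_definition))
    return (is_subtype(cp, norm_cp) and is_subtype(sp_type, norm_sp_type)
            and definition_ok)

def invoked(norms, users):
    return {i for i, n in norms.items() if any(user_match(u, n) for u in users)}

def active(norms, invoked_ids, comps):
    return {i for i in invoked_ids if any(comp_match(c, norms[i]) for c in comps)}

def applicable(norms, active_ids, user, comp):
    return {i for i in active_ids
            if user_match(user, norms[i]) and comp_match(comp, norms[i])}

invoked_ids = invoked(NORMS, ROLES)
active_ids = active(NORMS, invoked_ids, ACTIVE_COMPS)
jack_eval = applicable(NORMS, active_ids, "Jack", ("Eval", ACTIVE_SP))
print(sorted(invoked_ids))  # [59, 60, 61, 62, 63, 64, 65]
print(sorted(active_ids))   # [60, 61, 62, 63, 64, 65]
print(sorted(jack_eval))    # [60, 64, 65]
```

Because each filter only inspects the output of the previous one, the applicable sets are computed over the six active norms rather than the full norm base, which is the restriction the norm dynamics are meant to provide.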

The different norm sets are depicted in Fig. 1. The various subsets depicted 
within the set of active norms represent the different sets of applicable norms 
determined above. In the next section, the informal notation of the knowledge 
definitions that are the object of specification processes is made formal using 
conceptual graph notation. 



Fig. 1. Norm Dynamics Example

3 Specification Knowledge Definition Representation

Four different categories of specification knowledge definitions are distinguished
in RENISYS: type definitions, state definitions, action norms and composition
norms. Type definitions are used to represent ontological knowledge, state defini-
tions represent states-of-affairs, action norms regulate operational workflow be-
haviour, while composition norms govern the meta-level specification behaviour.
Since the role of these definitions in the specification process has already been 
discussed quite extensively in [2] and [4], we here only show how they are repre- 
sented in conceptual graphs, along with a simple example. One main advantage 
of using conceptual graphs over, for example, SQL tables and operations, is that 
in this way generalization hierarchies of specification knowledge can be taken 
into account. 

The general graph structure of a specification knowledge definition is: 

[k : def].

Here, k is a knowledge category, whereas def represents the definition core in
graph format. As the specific graph representation format of the definitions varies 
for the various knowledge categories, their formats are discussed separately. 

3.1 Type Definitions 
Definition 5 

A type definition $d_t = (id, t_d, t_g, E, R) \in \mathcal{D}_T$, with $t_d$ the defined type, $t_g$ the
genus type, E a set of entities, and R the set of relations connecting them, is
defined as:

[Type : [$t_d$ : *x] -> (Def) -> [$t_g$ : *x] dif($d_t$)].

dif($d_t$), the differentia of the type definition, is connected to the genus con-
cept [$t_g$ : *x] by its relations that have the genus placeholder x in their source or
destination concepts. Thus, the differentia forms a subgraph that specializes the
genus to the defined type. This representation of the type definition is different
from the one used by Sowa (1984, p. 106). The source entity of the (Def)-relation
denotes the defined type, the destination entity the genus type.

Example A (partial) type definition of the report-editing process could be:

[Type : [Edit_Report : *x] -> (Def) -> [Edit : *x] -
    (Matr) -> [Draft_Report]
    (Rslt) -> [Edited_Report]].



3.2 State Definitions 
Definition 6 

A state definition $d_s = (id, E, R) \in \mathcal{D}_S$ is defined as:

[state : def].

def is the conceptual graph formed by the concepts in E linked by the rela-
tions in R. To be meaningful, state definitions need to be circumscribed by the
available type definitions.

Example The state definition that says that Harry is the list owner of the
CG-mailing list is represented as:

[state : [List_Owner : #Harry] -> (Poss) -> [Mailing_List : #CG]].



3.3 Action Norms 
Definition 7 

An action norm $d_{an} = (id, de, a, cp, w) \in \mathcal{D}_{AN}$ is represented as follows:

[an : a <- (Agnt) <- cp -> (Obj) -> w].

de is the deontic effect of the norm. a, cp, w stand for some actor³, control
process, and workflow, respectively. The norm category label an is:

    Perm_Act   if de = Perm
    Req_Act    if de = Req
    Forb_Act   if de = Forb

Example The graph representation of the action norm that says that an editor
may carry out the editing process is:

[Perm_Act : [Editor] <- (Agnt) <- [Exec] -> (Obj) -> [Edit]].



3.4 Composition Norms 

Definition 8 A composition norm $d_{cn} = (id, de, a, cp, sp) \in \mathcal{D}_{CN}$ is represented
as:

[cn : a <- (Agnt) <- cp -> (Obj) -> dp -> (Rslt) -> def].

Here, de, a, cp, sp mean the deontic effect, actor, control process, and speci-
fication process; dp = [type(sp)] and def = ref(sp). The composition category
label cn is:

    Perm_Comp   if de = Perm
    Req_Comp    if de = Req
    Forb_Comp   if de = Forb

³ The meaning of the term actor is different from its interpretation in CGT, where it
refers to a node in a dataflow graph that can perform computations on the declarative
graph knowledge [9, p. 188].

Example Composition norm #58 is represented as:

[Perm_Comp : [Publ_Coord] <- (Agnt) <- [Init] -> (Obj) ->
[Terminate_State] -> (Rslt) -> [State : [Reviewer]]].
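Definition 8 can be read as a template that is filled from the elements of the norm tuple. The following Python sketch renders a norm in linear notation; the function name `to_linear`, its parameters, and the exact spacing of the output are assumptions for illustration, and the embedded knowledge definition is passed in as an already rendered graph.

```python
# Category label of the composition norm for each deontic effect (Definition 8).
LABEL = {"Perm": "Perm_Comp", "Req": "Req_Comp", "Forb": "Forb_Comp"}

def to_linear(de, actor, cp, sp_type, definition):
    """Render a composition norm in linear conceptual graph notation.

    `definition` is the already rendered knowledge definition, for example
    "[State: [Reviewer]]" or "[Definition]".
    """
    return (
        f"[{LABEL[de]}: [{actor}] <- (Agnt) <- [{cp}] -> (Obj) -> "
        f"[{sp_type}] -> (Rslt) -> {definition}]"
    )

# Composition norm #58 from the example.
norm_58 = to_linear("Perm", "Publ_Coord", "Init", "Terminate_State",
                    "[State: [Reviewer]]")
print(norm_58)
```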



4 Norm Dynamics Calculation with Conceptual Graphs 

This section illustrates how conceptual graph theory can be used to calculate 
the norm dynamics discussed in Sect. 2. To this purpose we use the Peirce con-
ceptual graphs workbench⁴, which was developed by Gerard Ellis. Among other
things, the Peirce tool allows for the handling of nested graphs, which are needed
to represent composition norms⁵.



Type Hierarchy 

The knowledge base has been loaded with the type hierarchy described in 
Sect. 2. These type definitions are represented as follows: 

Editor < PD_Actor. 

Journal_Editor < Editor. 

Publ_Coord < PD_Actor. 

Reviewer < PD_Actor. 

[...] 

State Definitions 



The Peirce knowledge base contains these state definitions to describe that 
user John is a journal editor and Jack a reviewer: 

[State: [User: #John]]. 

[State: [User: #Jack]]. 

[State: [Journal_Editor : #John]]. 

[State: [Reviewer: #Jack]]. 



Composition Norms 

The set T>cn consists of legitimate composition norms #58-#65. These 
norms are represented, in order, as: 

^ htt p : / / WWW. cs.adelaide.edu.au/users/peirce/ 

® As in Peirce the symbols and cannot be used in the referent, they are both 
replaced by a The ‘> ’-symbol indicates the prompt. A further explanation of the 
precise syntax of commands and graphs is given in [5] and is not repeated here.

[Perm_Comp: [Publ_Coord] <- (Agnt) <- [Init] -> (Obj) -
-> [Terminate_State] -> (Rslt) -> [State: [Reviewer]]].

[Req_Comp: [Editor] <- (Agnt) <- [Exec] -> (Obj) -
-> [Modify_Type] -> (Rslt) -> [Type: [Review]]].

[Perm_Comp: [Actor] <- (Agnt) <- [Control] -> (Obj) -
-> [Specify] -> (Rslt) -> [Definition]].

[Req_Comp: [Editor] <- (Agnt) <- [Control] -> (Obj) -
-> [Create_Type] -> (Rslt) -> [Type: [Edit]]].

[Forb_Comp: [Journal_Editor] <- (Agnt) <- [Eval] -> (Obj) -
-> [Create_Type] -> (Rslt) -> [Type: [Edit]]].

[Req_Comp: [Reviewer] <- (Agnt) <- [Init] -> (Obj) -
-> [Create_Type] -> (Rslt) -> [Type: [Edit]]].

[Perm_Comp: [Reviewer] <- (Agnt) <- [Control] -> (Obj) -
-> [Create_Type] -> (Rslt) -> [Type: [Edit_Report]]].

[Forb_Comp: [Reviewer] <- (Agnt) <- [Eval] -> (Obj) -
-> [Create_Type] -> (Rslt) -> [Type: [Edit_Report]]].

Invoked Norms Calculation 

For each composition norm dcn ∈ D_CN, dcn is in the set of invoked composition
norms D_CN_I if there is a user match between u and a, with u being some user
and a the actor part of dcn (see Def. 1 and 2).

The (temporary) set of user referents R_U consists of the referents of the
user concepts in state definitions of users. These definitions are retrieved by the
following operation.

> (Specialisations) -> [[State: [User]]]?

[State: [User: #John]].

[State: [User: #Jack]].
true



R_U = {#John, #Jack}

Now, ∀ dcn ∈ D_CN, with a the type label of the actor part of dcn: if ∃ r_u ∈ R_U
such that there is a specialization of [State: [a: r_u]], then dcn ∈ D_CN_I.
For example, for composition norm #58 (with a = Publ_Coord):

> (Specialisations) -> [[State: [Publ_Coord: #John]]]?
no specializations

true

>

> (Specialisations) -> [[State: [Publ_Coord: #Jack]]]?
no specializations

true

>



Thus, composition norm #58 ∉ D_CN_I.

On the other hand, for composition norm #59 (with a = Editor): 

> (Specialisations) -> [[State: [Editor: #John]]]?

[State: [Journal_Editor: #John]].

true

> (Specialisations) -> [[State: [Editor: #Jack]]]?
no specializations

true



Thus, since John is affected by it, composition norm #59 ∈ D_CN_I. Similar
calculations can be made for the other norms. It can thus be derived that D_CN_I
consists of composition norms #59-#65.
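
The invoked-norm test above can be sketched outside the Peirce workbench. The following Python fragment is only an illustration, not the workbench's algorithm: the links from PD_Actor and User to Actor are assumptions made here (the paper elides part of the type hierarchy), and all function names are hypothetical.

```python
# Type hierarchy (child -> parent). The first four links come from the
# type definitions above; the two links to "Actor" are assumptions.
SUPERTYPE = {
    "Editor": "PD_Actor",
    "Journal_Editor": "Editor",
    "Publ_Coord": "PD_Actor",
    "Reviewer": "PD_Actor",
    "PD_Actor": "Actor",   # assumed
    "User": "Actor",       # assumed
}

# State definitions as (type label, user referent) pairs.
STATES = [("User", "John"), ("User", "Jack"),
          ("Journal_Editor", "John"), ("Reviewer", "Jack")]

# Type label of the actor part of composition norms #58-#65.
NORM_ACTOR = {58: "Publ_Coord", 59: "Editor", 60: "Actor", 61: "Editor",
              62: "Journal_Editor", 63: "Reviewer", 64: "Reviewer",
              65: "Reviewer"}


def is_subtype(t, ancestor):
    """True if t equals ancestor or lies below it in the hierarchy."""
    while t is not None:
        if t == ancestor:
            return True
        t = SUPERTYPE.get(t)
    return False


def matching_users(actor_type):
    """Users whose state definition specializes [State: [actor_type]]."""
    return {user for (t, user) in STATES if is_subtype(t, actor_type)}


def invoked_norms():
    """Norms whose actor part matches at least one user."""
    return [n for n, actor in sorted(NORM_ACTOR.items())
            if matching_users(actor)]


print(invoked_norms())  # [59, 60, 61, 62, 63, 64, 65]
```

Under these assumptions the sketch reproduces the result derived above: norm #58 is not invoked, and norm #59 is invoked because of John.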
Active Norms Calculation

An invoked norm is also an active norm if there is a composition match
between one of the active compositions and the norm composition part (see
Def. 3).

In order to calculate the active norms, several temporary graphs are needed:

• Every active composition comp_a ∈ Comp_A is stored in a separate graph.

• For each invoked composition norm dcn_i ∈ D_CN_I, its norm composition
part is stored in a new graph.

Now, ∀ dcn_i ∈ D_CN_I: dcn_i ∈ D_CN_A if at least one of the active composition
graphs is a specialization of the norm composition part graph. For example:

• The active composition graphs are: 



[Init] -> (Obj) -> [Create_Type] -> (Rslt) -> [Type: [Edit_Report]].
[Exec] -> (Obj) -> [Create_Type] -> (Rslt) -> [Type: [Edit_Report]].
[Eval] -> (Obj) -> [Create_Type] -> (Rslt) -> [Type: [Edit_Report]].



• For invoked composition norm #59, the norm composition part graph is: 



[Exec] -> (Obj) -> [Modify_Type] -> (Rslt) -> [Type: [Review]]. 



Performing the specialization operation gives the following result: 



> (Specialisations) -> [[Exec] -> (Obj) -> [Modify_Type] -> (Rslt) -
-> [Type: [Review]]]?

[Exec] -> (Obj) -> [Modify_Type] -> (Rslt) -> [Type: [Review]].
true

>



Since the operation only returns the norm composition graph itself, and none
of the active composition graphs, composition norm #59 ∉ D_CN_A.

• For invoked composition norm #60, however, the norm composition part 
graph is: 

[Control] -> (Obj) -> [Specify] -> (Rslt) -> [Definition]. 



For this graph, the specialization operation returns: 



> (Specialisations) -> [[Control] -> (Obj) -> [Specify] -> (Rslt) -
-> [Definition]]?

[Exec] -> (Obj) -> [Create_Type] -> (Rslt) -> [Type: [Edit_Report]].

[Init] -> (Obj) -> [Create_Type] -> (Rslt) -> [Type: [Edit_Report]].
[Control] -> (Obj) -> [Specify] -> (Rslt) -> [Definition].

[Eval] -> (Obj) -> [Create_Type] -> (Rslt) -> [Type: [Edit_Report]].
true

>



At least one (in fact, all three) of the active composition graphs is returned,
so composition norm #60 ∈ D_CN_A. Similar calculations can be made for the
other norms. D_CN_A consists of composition norms #60-#65.
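
The composition match can be sketched in the same spirit. This Python fragment is a simplified illustration rather than the Peirce specialization operation: each composition is flattened into a 4-tuple (control process, specification process, definition type, nested definition label), and the subtype links listed in SUPERTYPE are assumptions chosen to agree with the query results shown above.

```python
# Assumed subtype links (child -> parent); the paper does not list them.
SUPERTYPE = {
    "Init": "Control", "Exec": "Control", "Eval": "Control",
    "Create_Type": "Specify", "Modify_Type": "Specify",
    "Type": "Definition",
    "Edit_Report": "Edit",
}

# Active compositions, flattened to 4-tuples.
ACTIVE_COMPS = [
    ("Init", "Create_Type", "Type", "Edit_Report"),
    ("Exec", "Create_Type", "Type", "Edit_Report"),
    ("Eval", "Create_Type", "Type", "Edit_Report"),
]

# Norm composition parts of the invoked norms #59-#65.
# None stands for a generic (unspecified) nested definition.
NORM_COMP = {
    59: ("Exec", "Modify_Type", "Type", "Review"),
    60: ("Control", "Specify", "Definition", None),
    61: ("Control", "Create_Type", "Type", "Edit"),
    62: ("Eval", "Create_Type", "Type", "Edit"),
    63: ("Init", "Create_Type", "Type", "Edit"),
    64: ("Control", "Create_Type", "Type", "Edit_Report"),
    65: ("Eval", "Create_Type", "Type", "Edit_Report"),
}


def is_subtype(t, ancestor):
    """True if t equals ancestor or lies below it in the hierarchy."""
    while t is not None:
        if t == ancestor:
            return True
        t = SUPERTYPE.get(t)
    return False


def specializes(comp, pattern):
    """True if comp specializes pattern, position by position."""
    for c, p in zip(comp, pattern):
        if p is None:          # generic slot: anything specializes it
            continue
        if c is None or not is_subtype(c, p):
            return False
    return True


def active_norms():
    """Invoked norms matched by at least one active composition."""
    return [n for n, pattern in sorted(NORM_COMP.items())
            if any(specializes(c, pattern) for c in ACTIVE_COMPS)]


print(active_norms())  # [60, 61, 62, 63, 64, 65]
```

With these assumed links, norm #59 fails because Create_Type is not a subtype of Modify_Type, while norm #60 is matched by all three active compositions, as in the transcript above.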



Applicable Norms Calculation

For each combination of user u and active composition comp_a, a set of applicable
norms D_CN_APPL(u, comp_a) is defined (see Def. 4). An active norm is in
such a set if there are both a user match between u and the actor part of the
norm, and a composition match between comp_a and the norm composition part.



For example, to calculate D_CN_APPL(u, comp_a), with u = John and comp_a =
[Init] -> (Obj) -> [Create_Type] -> (Rslt) -> [Type: [Edit_Report]]:

• To see whether composition norm #60, which has Actor as the type label 
of its actor part, is in this set, we first need to determine whether there is a user 
match: 

> (Specialisations) -> [[State: [Actor]]]?

[State: [User: #John]]. 

[State: [User: #Jack]]. 

[State: [Journal_Editor : #John]]. 

[State: [Reviewer: #Jack]]. 
true 



Thus, there is indeed a user match between John and the actor part of norm 
#60. Now, it must be seen if there is a composition match as well. 

The norm composition part graph for norm #60 is: 

[Control] -> (Obj) -> [Specify] -> (Rslt) -> [Definition]. 



The matches with the active composition graphs are: 

> (Specialisations) -> [[Control] -> (Obj) -> [Specify] ->
(Rslt) -> [Definition]]?

[Exec] -> (Obj) -> [Create_Type] -> (Rslt) -> [Type: [Edit_Report]].

[Init] -> (Obj) -> [Create_Type] -> (Rslt) -> [Type: [Edit_Report]].
[Control] -> (Obj) -> [Specify] -> (Rslt) -> [Definition].

[Eval] -> (Obj) -> [Create_Type] -> (Rslt) -> [Type: [Edit_Report]].
true



Thus, comp_a is in the set of results, and the composition match is therefore
successful. Since both the required user and composition match exist, composition
norm #60

∈ D_CN_APPL(John, [Init] -> (Obj) -> [Create_Type] -> (Rslt) -> [Type: [Edit_Report]]).

Similar calculations can be made for the other composition norms in this 
set, as well as for the other applicable norm sets. A more efficient calculation 
would reuse the results of the user matches done in the calculation of the invoked 
norms, and the composition matches done for the active norms calculation. For 
clarity, the specialization operations were repeated here, however. 
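
The reuse mentioned above can be pictured as two lookup tables that are filled once and then intersected. The Python sketch below is an assumption-laden illustration: only the entries for norm #60 are confirmed by the transcripts in this section, the other entries are illustrative values that follow from assumed subtype links (for example Edit_Report below Edit), and each active composition is identified simply by its control process label.

```python
# Cached user matches: norm -> users matching its actor part.
USER_MATCHES = {
    59: {"John"}, 60: {"John", "Jack"}, 61: {"John"}, 62: {"John"},
    63: {"Jack"}, 64: {"Jack"}, 65: {"Jack"},
}

# Cached composition matches for the active norms:
# norm -> control labels of the active compositions it matches.
COMP_MATCHES = {
    60: {"Init", "Exec", "Eval"},
    61: {"Init", "Exec", "Eval"},   # illustrative
    62: {"Eval"},                   # illustrative
    63: {"Init"},                   # illustrative
    64: {"Init", "Exec", "Eval"},   # illustrative
    65: {"Eval"},                   # illustrative
}


def applicable_norms(user, comp):
    """Active norms with both a user match and a composition match."""
    return sorted(n for n, comps in COMP_MATCHES.items()
                  if comp in comps and user in USER_MATCHES.get(n, set()))


print(applicable_norms("John", "Init"))
```

For u = John and the [Init] composition, the resulting list contains norm #60, in line with the calculation above; no specialization query is repeated.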



5 Discussion and Conclusions 

Existing specification approaches are not well suited to specifying network
information systems for virtual professional communities, since these require a
legitimate user-driven approach. Traditional waterfall-based specification
methods, such as SDM, are quite rigid and depend to a large extent on external
analysts controlling the specification process [1]. Other methods, notably
those based on a socio-technical paradigm, such as Soft Systems Methodology
or ETHICS, assign a more prominent role to active user participation in the
specification process [11,6]. However, they still do not adequately support
evolutionary systems development and are indiscriminate about which users to
involve in which particular specification change processes.

In this paper, we have demonstrated how to calculate the relevant group of
users to involve in a particular specification change process. To this purpose, a
user facing a breakdown in his or her work can identify problematic knowledge
definitions that he or she would like to see changed. Composition norms are
essential for precisely regulating the specification processes needed to resolve
these problematic definitions. They describe the meta-level change behaviour.
This is in contrast with numerous workflow modelling methods, either
activity-based (i.e. specifying logistical workflows) or conversation-based
(modelling communications and commitments), which do not provide guidelines on
who is to change what [7]. In [3] we describe how to determine the resultant
deontic effect of a set of applicable norms, which states whether a particular
user is ultimately permitted, required, or forbidden to control (i.e. initiate,
execute, or evaluate) a particular specification process. Among other things,
this requires that any norm conflicts that occur be resolved, which we have done
by making use of work in dynamic deontic logic such as that described in [12].
The actual change process takes the form of a conversation among the selected
users from the relevant user group. A Specification Process Model, based on Van
Reijswoud’s speech-act theory-based Transaction Process Model [10], prescribes
the conversational moves that the various users can make. A prototype web server
with mail functionality has been developed that can be used to support the
specification process of a restricted set of knowledge definitions. Several case
studies have been done that demonstrate how this support can facilitate network
evolution. The still limited functionality of the tool will soon be upgraded to
provide robust support for the full specification process.

Conceptual graph theory provides the theoretical constructs and tools to 
allow for such specification knowledge to be represented in a concise way and 
for the necessary calculations to be carried out efficiently. In [2], the importance 
of finding new applications such as these for CGT was discussed. We have now 
concretely demonstrated how existing tools such as the Peirce conceptual graph 
workbench can be applied to supporting the legitimate user-driven specification 
process. Of course, much work still needs to be done on optimizing the algorithms 
used, and on the integration of standard conceptual graph tool functionality with 
the RENISYS tool. These optimization and integration problems are the subject
of current and future research.
Abstract. In [11], we presented PROLOG++, a CG-based conceptual and
contextual extension of PROLOG. This paper discusses some limitations of
PROLOG++ and presents a more expressive, efficient and uniform version,
called PROLOG+CG, which overcomes these limitations. PROLOG+CG is a CG
object-oriented logic programming language suited for many CG applications.
Unlike PROLOG++, which was implemented in C-Prolog,
PROLOG+CG is implemented directly in JAVA 2. Thanks to this
change of implementation, PROLOG+CG incorporates CG (both simple
and compound) as a basic data structure, alongside terms and lists. Other
possibilities offered by PROLOG+CG due to its direct implementation in
JAVA 2 are also discussed in this paper.



1. Introduction 

To achieve more expressive power, the PROLOG language has been extended in at
least two directions:

• Conceptual extension: a goal can be represented by a term (a predicate) or by a
complex structure, such as a typed feature structure [1, 2, 3, 10, 18].

• Contextual extension: it is illustrated first by object-based PROLOG [14, 16, 4, 5],
where a Prolog program is partitioned into objects (or worlds, theories, modules,
bases, spaces or other similar terms), each of which contains a set of rules. A goal is
then resolved in the context of a specific object.

Contextual extension of PROLOG is also illustrated by object-oriented PROLOG
[7, 15, 17], where inheritance between objects is the norm.

In the Conceptual Graph community, some systems [6, 8, 9] incorporate a deductive
component that interprets a set of rules in which all the goals of a rule are
represented by simple Conceptual Graphs (CG). These components do not subsume
PROLOG and were not presented as extensions of PROLOG.

To fill this gap in CG research (i.e., the need for a CG-based extension of Prolog),
we proposed in [11, 12] PROLOG++, a conceptual and contextual extension of
PROLOG.

This paper discusses some limitations of PROLOG++ and presents a more expressive,
efficient and uniform version, called PROLOG+CG, which overcomes these limitations
(the name of the language was changed from PROLOG++ to PROLOG+CG to avoid
a conflict with the PROLOG++ of Moss [17]).

The beta version of PROLOG+CG is available from the site:

www.insea.ac.ma/CGTools/PROLOG+CG.htm

The paper is organized as follows. Section 2 gives a brief review of PROLOG++ with
a description of some of its limitations. PROLOG+CG, the new version of
PROLOG++ which overcomes these limitations, can be depicted by the following
equation: PROLOG+CG = J-PROLOG + CG + Object-Based + Inheritance.

Section 3 presents the core of PROLOG+CG, namely J-PROLOG with CG; Section 4
introduces the object level, while inheritance is discussed in Section 5.

Section 6 gives an outline of current and future work concerning PROLOG+CG. We
then conclude the paper.



2. Brief Review and Limitations of PROLOG++ 

A PROLOG++ program [11, 12] is composed of a declarative knowledge base DKB 
(type hierarchy + conceptual structures), a strategic knowledge base SKB (a set of 
objects) and standard Prolog rules. Each rule in an object is prefixed by a term which 
represents the descriptor of the object. CG is used basically in DKB to represent 
conceptual structures and it is used in SKB to represent the structural part of the 
object, the method definition and the method invocation. Here is an example of an
object rule in PROLOG++: the rule is prefixed by the descriptor “seriousDisease”, its
head is a CG and its tail begins with a message to the object “disease”; the content of
this message is a CG.

seriousDisease :: { [sensitive] -pat-> [person : P]
                                -obj-> [dispute] } :-
   disease :: { [gotDisease] -obj-> [diseaseCardiac]
                             -pat-> [person : P] -fatherOf-> [man : M] &
                [beingDead] -pat-> [man : M]
                            -cause-> [diseaseCardiac] },
   attrindividu :: { [person : P] -withAge-> [age : A] },
   validateValue(A),

Please note that a goal in the tail of an object rule or in a standard Prolog rule can be
a message or a term. CG unification is used when the goal is a message.

CG is considered in PROLOG++ as a “pseudo” data structure: a PROLOG++
program is translated to an equivalent C-Prolog program and a CG is translated to a
list structure. A term that has a CG as an argument treats the latter as a list, except
for CG operations (MaximalJoin, Generalize, Subsume, etc.), which are provided by
PROLOG++ as primitive terms/goals and which treat their arguments as CGs.

Let us now consider some limitations of PROLOG++:

• CG is not considered as a data structure of PROLOG. As explained above, when a
CG is used as an argument of a term (other than a primitive CG operation), it is
considered as a list. Term unification has not been extended to apply CG
unification when the corresponding arguments of the terms to be unified are CGs.

• A goal that is not a message can be represented only by a term.

• Inheritance among objects was not considered.

• The interpreter of PROLOG++ was implemented using C-Prolog. A
PROLOG++ program is rewritten as an equivalent C-Prolog program.
PROLOG++ is thus dependent upon C-Prolog.

To overcome these limitations, we have developed our own version of PROLOG:
J-PROLOG. The J-PROLOG interpreter and environment are implemented in JAVA 2,
which ensures portability.

J-PROLOG has then been extended by adding CG, objects and inheritance. The result
of these extensions of J-PROLOG is called PROLOG+CG.



3. The Core of PROLOG+CG: J-PROLOG with CG

The core of PROLOG+CG is J-PROLOG with the addition of CG as a basic data
structure, alongside Term and List. Thus, the elementary data of PROLOG+CG are
Numbers, Booleans, Identifiers and Strings. Composed data are Lists, Terms and
CGs. An argument of a term, an element of a list or a value of a concept in a CG can
be an elementary or a composed datum.

The head of a rule can be a term or a CG. Finally, a goal in the tail of a rule can be
represented by a term, a CG or a variable. See appendix 1 for the syntax of
PROLOG+CG and appendix 2 for a brief description of its environment.

CG in PROLOG+CG. A CG g in PROLOG+CG has the following characteristics:

• g is functional: any concept of g has mutually exclusive incoming relations and
mutually exclusive outgoing relations.

• g can be simple or compound.

• A concept of g has the following structure:

[Type : Referent = Value] like [Age : MyAge = 36]
where Type stands for a specific concept type or for a variable. Referent (optional)
stands for an individual identifier, a “multi-referent” or a variable. Value
(optional) stands for a PROLOG+CG datum (including a list, a term or a CG) or for
a variable. A multi-referent, which has the form ‘*Number’, is used in the linear
notation of a CG to identify several occurrences of the same concept.

• Relations in g are dyadic.

The use of CG requires the definition of a hierarchy of concept types and the
declaration of instances of types (particulars). For that purpose, PROLOG+CG offers
specialization rules and instantiation rules.
Specialization rule. A specialization rule describes for a type Type0 its immediate
subtypes Type1, Type2, ..., TypeN. It has the following form:

Type0 > Type1, Type2, ..., TypeN.

Example:

Universal > Person, Action, Object, Attribute.

Currently, PROLOG+CG provides three primitive operations on the concept type
hierarchy:

■ subType(Type1, Type2), which verifies whether Type1 is a subtype of Type2.

■ maxComSubType(Type1, Type2, Type3), which verifies that Type3 is (or looks
for the Type3 that is) the maximal common subtype of Type1 and Type2.

■ minComSuperType(Type1, Type2, Type3), which verifies that Type3 is (or looks
for the Type3 that is) the minimal common supertype of Type1 and Type2.
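
To make the three primitives concrete, here is a minimal Python sketch of how they might behave on a small hierarchy. It is not the PROLOG+CG implementation: the function names are hypothetical Python counterparts, the hierarchy is written as a dictionary of immediate subtypes, and returning None when no unique answer exists is a choice made for this illustration.

```python
# Immediate subtypes, one entry per specialization rule
# (e.g. "Person > Student, Employee.").
SUBTYPES = {
    "Universal": ["Animate", "Inanimate", "Action"],
    "Action": ["Extract"],
    "Animate": ["Person"],
    "Person": ["Student", "Employee"],
    "Student": ["ResearchAssistant"],
    "Employee": ["ResearchAssistant"],
    "Inanimate": ["Text"],
    "Text": ["Book"],
}


def descendants(t):
    """t together with all of its direct and indirect subtypes."""
    seen, stack = set(), [t]
    while stack:
        cur = stack.pop()
        if cur not in seen:
            seen.add(cur)
            stack.extend(SUBTYPES.get(cur, []))
    return seen


ALL_TYPES = descendants("Universal")


def sub_type(t1, t2):
    """Counterpart of subType: is t1 a subtype of (or equal to) t2?"""
    return t1 in descendants(t2)


def _unique_extreme(candidates, dominated_by):
    """Return the single candidate not dominated by another, else None."""
    best = [t for t in candidates
            if not any(o != t and dominated_by(t, o) for o in candidates)]
    return best[0] if len(best) == 1 else None


def max_com_sub_type(t1, t2):
    """Counterpart of maxComSubType; None if there is no unique answer."""
    common = descendants(t1) & descendants(t2)
    return _unique_extreme(common, sub_type)


def min_com_super_type(t1, t2):
    """Counterpart of minComSuperType; None if there is no unique answer."""
    common = {t for t in ALL_TYPES if sub_type(t1, t) and sub_type(t2, t)}
    return _unique_extreme(common, lambda t, o: sub_type(o, t))


print(sub_type("ResearchAssistant", "Person"))    # True
print(max_com_sub_type("Student", "Employee"))    # ResearchAssistant
print(min_com_super_type("Student", "Employee"))  # Person
```

Two types with no common subtype, such as Person and Action here, yield None, which plays the role of an absurd type in this sketch.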

Instantiation rule. An instantiation rule specifies for a type Type0 its instances Inst1,
..., InstN:

Type0 = Inst1, ..., InstN.

Example : 

Man = Jo, Mark. 

Here is a PROLOG+CG program that makes use of CG, specialization rules and 
instantiation rules. 

// Specialization rules that describe the concept
// types hierarchy

Universal > Person, Action, Object, Attribute.
Object > House, Restaurant, Walnut, Shell, Spoon.
Attribute > Classical, Age, Easily.
Person > Man, Woman.
Action > Perform, Go, Work, Buy, Eat, Search.

// Instantiation rules that describe the instances
// of some types
Man = Jo, Mark.
Woman = Mary, Jane.

// Inference rules. Note that a goal can be a term or
// a CG. "x", "w" and "a" represent variables.

goodSister(x) :-
   employee(x), [Woman : x] -attr-> [Classical].

[Woman : w] -attr-> [Classical] :-
   [Work] -
      -near-> [House] -poss-> [Woman : w] -ageOf-> [Age = a],
      -agnt-> [Woman : w],
   inf(a, 40).

// An example of a fact described by a CG. The
// latter contains a multi-referent "*1" to identify
// the concept [House : *1].

[Work] -
   -agnt-> [Person : Jane] -
              -ageOf-> [Age = 30],
              <-poss- [House : *1] <-nearOf- [Restaurant];,
   -near-> [House : *1].

// Examples of facts described by terms.
employee(Mary).
employee(Jane).

// An example of a compound CG
[Person] <-agnt- [Perform] -obj->
   [Action = [Eat] -
                -obj-> [Walnut = wal2] -part->
                          [Shell : myShell = toto],
                -instr-> [Spoon] -matr-> [Shell : myShell]
   ] -manr-> [Easily].

// Examples of facts described by terms that
// contain CG as their arguments.
sense("extract", [Search] -
                    -agnt-> [Person],
                    -from-> [Book],
                    -obj-> [Information]).

sense("classical woman", [Woman] -attr-> [Classical]).
Some requests concerning the above program are the following:

?- goodSister(x).

{x = Jane}

?- goodSister(Mary).
no.

?- [Woman : x] -attr-> [Classical].

{x = Jane}

?- [Man] <-agnt- [Perform] -obj-> [Action =
      [Eat] -obj-> [Walnut] -part-> [Shell : x]].

{x = myShell}

?- sense("extract", g).

{g = [Search] -
        -agnt-> [Person],
        -from-> [Book],
        -obj-> [Information]}

?- sense("extract",
         [x] <-obj- [Action] -agnt-> [Person]).
{x = Information}

?- sense("extract",
         [x] <-obj- [Work] -agnt-> [Person]).

no.

// search the CG associated to "classical woman"
// and try to satisfy it (as the next goal of
// the request).

?- sense("classical woman", g), g.

{g = [Woman] -attr-> [Classical]}



The unification operation used by the PROLOG+CG interpreter has been
extended to account for the use of CG as a new basic data structure. Thus, when the
two elements to unify are CGs, CG unification is called.

CG unification. Unifying the CG g1 with the CG g2 amounts to verifying that g1 is a
subgraph of g2, provided that the unification of the corresponding concepts succeeds
[6]. A concept c1 of g1 can be unified with a concept c2 of g2 if the type, the referent
and the value of c1 can be unified with the type, the referent and the value of c2,
respectively. Two specific types t1 and t2 can be unified if they have a maximal
common subtype t3 other than ‘Absurd’. The result of referent unification should
conform to type t3. Since a concept value is a PROLOG+CG datum, concept value
unification corresponds to PROLOG+CG unification.

The CG unification algorithm adopted in PROLOG+CG can be described briefly as
follows:
unify(g1, g2) =

1. Verify that all the relations of g1 are also relations in g2. /** This is a weak but
   useful constraint. For instance, if g1 contains a relation that is not used in
   g2, then we can conclude that the two CGs cannot be unified. **/

2. Determine an entry point for g1 and g2:
   if g1 and g2 contain respectively two concepts c1 and c2 that have the same
   specific referent,
   then consider c1 and c2 as the two entry points for g1 and g2. In this case,
   unify(g1, g2) will be deterministic, without backtracking;
   else search for two identical relations in g1 and g2 and consider them as the two
   entry points for g1 and g2 (or, in terms of concepts, consider the source
   concepts of the two relations as entry points). unify(g1, g2) could backtrack
   in this case, since g1 and g2 can contain several relations with the same
   identifier.

3. Starting with the unification of the two entry points, propagate the unification to
   the two graphs g1 and g2.

4. Check that g1 is a subgraph of g2: all the concepts and branches of g1 have
   been unified with concepts and branches of g2.

5. If the unification of g1 and g2 fails and the entry points were determined by the
   search for identical relations in g1 and g2,
   then backtrack to search for another pair of identical relations (and restart from 3).

Please note that since CG unification is integrated into Prolog unification, constraints
that result from CG unification influence the rest of the Prolog unification and vice
versa.
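
The steps above can be illustrated with a simplified Python sketch. It is not the PROLOG+CG algorithm itself: the entry-point optimization of step 2 is replaced by plain backtracking over relations, concept values and variables are left out, only concepts attached to a relation are considered, and the data layout (dictionaries with "concepts" and "edges") and all names are assumptions of this illustration.

```python
# Hypothetical data: child type -> parent types, and instance -> type.
PARENTS = {"Person": ["Universal"], "Action": ["Universal"],
           "Man": ["Person"], "Woman": ["Person"], "Work": ["Action"],
           "House": ["Universal"], "Age": ["Universal"]}
INSTANCE_OF = {"Jo": "Man", "Mark": "Man", "Mary": "Woman", "Jane": "Woman"}
ALL_TYPES = set(PARENTS) | {p for ps in PARENTS.values() for p in ps}


def is_subtype(t, ancestor):
    return t == ancestor or any(is_subtype(p, ancestor)
                                for p in PARENTS.get(t, []))


def max_common_subtype(t1, t2):
    common = [t for t in ALL_TYPES if is_subtype(t, t1) and is_subtype(t, t2)]
    top = [t for t in common
           if not any(o != t and is_subtype(t, o) for o in common)]
    return top[0] if len(top) == 1 else None   # None stands for 'Absurd'


def unify_concepts(c1, c2):
    """Unify two (type, referent) concepts; referent None means generic."""
    (t1, r1), (t2, r2) = c1, c2
    t3 = max_common_subtype(t1, t2)
    if t3 is None or (r1 and r2 and r1 != r2):
        return None
    ref = r1 or r2
    if ref and not is_subtype(INSTANCE_OF.get(ref, t3), t3):
        return None                            # referent must conform to t3
    return (t3, ref)


def bind(g1, g2, mapping, n1, n2):
    """Map concept n1 of g1 onto concept n2 of g2 if they unify."""
    if n1 in mapping:
        return mapping[n1][0] == n2
    unified = unify_concepts(g1["concepts"][n1], g2["concepts"][n2])
    if unified is None:
        return False
    mapping[n1] = (n2, unified)
    return True


def match(g1, g2, todo, mapping):
    """Map the remaining edges of g1 into g2, backtracking on failure."""
    if not todo:
        return mapping
    (s1, rel, d1), rest = todo[0], todo[1:]
    for s2, rel2, d2 in g2["edges"]:
        if rel2 != rel:
            continue
        trial = dict(mapping)
        if bind(g1, g2, trial, s1, s2) and bind(g1, g2, trial, d1, d2):
            result = match(g1, g2, rest, trial)
            if result is not None:
                return result
    return None


def unify(g1, g2):
    """Return a mapping of g1's concepts into g2, or None on failure."""
    relations2 = {rel for _, rel, _ in g2["edges"]}
    if any(rel not in relations2 for _, rel, _ in g1["edges"]):
        return None                            # step 1: relation check
    return match(g1, g2, list(g1["edges"]), {})


fact = {"concepts": {"w": ("Work", None), "j": ("Person", "Jane"),
                     "a": ("Age", None), "h": ("House", None)},
        "edges": [("w", "agnt", "j"), ("j", "ageOf", "a"),
                  ("h", "poss", "j"), ("w", "near", "h")]}
query = {"concepts": {"x": ("Work", None), "y": ("Woman", None)},
         "edges": [("x", "agnt", "y")]}
print(unify(query, fact))
```

In this example the generic [Woman] concept of the query is unified with [Person : Jane], giving the concept (Woman, Jane); the same query with [Man] fails because Jane does not conform to Man.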



4. Objects in PROLOG+CG 

Object in PROLOG+CG. An object is a set of “inference rules” prefixed by terms
with an identical signature. An object has the following form:

T1 :: R1
T2 :: R2
...
Tn :: Rn

where T1, T2, ..., Tn represent terms with the same signature and R1, R2, ..., Rn
stand for PROLOG+CG inference rules. The common signature of T1, T2, ..., Tn
constitutes the descriptor of the object.

An inference rule in PROLOG+CG. An inference rule, in an object or in the
outermost context (i.e., the program), has one of the following two forms:

G.

Gh :- G1, G2, ..., Gn.

where:

- G and Gh represent a goal that can be a term or a CG. It can also be a variable if
  the rule is inside an object.

- G1, G2, ... and Gn represent goals; each one can be a term, a CG, a variable or a
  message (see the next definition).

Sending a message. Sending a message to an object is expressed by a composed
goal T :: G, where T represents a term and G a goal (which can be a term, a
CG or a variable). T :: G can be read: “send a message to the object to execute
(satisfy) the goal G”. The descriptor of the object is the same as the signature of T.
To respond to a message T :: G, the interpreter first locates an object with a descriptor
that has the signature of the term T. Then it searches, inside the object, for a rule
Ti :: Ri such that T can unify with Ti and G can unify with the head of the rule Ri.




From PROLOG++ to PROLOG+CG 547 



Remark. A rule can also be prefixed by a CG: CG :: R. In this case, all the rules of
the program that are prefixed by CG constitute one object. As a consequence,
a message can also have the form CG :: Goal.

The following program illustrates the use of objects in PROLOG+CG: an object with
the descriptor hamza is defined. The object contains a fact that describes some
attributes of hamza and a rule that describes a method to compute the age of hamza.

hamza :: [PERSON] -DateOfBirth-> [BIRTH] -ptime->
            [DATE = (5, 04, 1995)].

hamza :: Age(A) :-
   currentDate(D1),
   hamza :: [PERSON] -DateOfBirth-> [BIRTH] -ptime->
               [DATE = D2],
   diffDate(D1, D2, A).

currentDate((14, 12, 1999)).

diffDate((x_Day2, y_month2, z_year2),
         (x_Day1, y_month1, z_year1),
         (x_Day, y_month, z_year)) :-
   val(x_Day, sub(x_Day2, x_Day1)),
   val(y_month, sub(y_month2, y_month1)),
   val(z_year, sub(z_year2, z_year1)), /.

An example of a request (i.e., sending the message Age(x) to hamza):

?- hamza :: Age(x).

{x = (9, 8, 4)}
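
The arithmetic behind this answer is a component-wise subtraction of the two date triples. A short Python sketch (the function name diff_date is a hypothetical counterpart of the diffDate rule) makes this explicit:

```python
def diff_date(d1, d2):
    """Component-wise difference of two (day, month, year) triples."""
    return tuple(a - b for a, b in zip(d1, d2))


current_date = (14, 12, 1999)
birth_date = (5, 4, 1995)
print(diff_date(current_date, birth_date))  # (9, 8, 4)
```

Like the rule as written, this sketch does no borrowing, so a day or month component can come out negative when the current day or month is smaller than that of the birth date.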

Here is another object-based program that illustrates how conceptual structures for a
concept type can be encapsulated within an object. It also shows how they can be
used.

Universal > Animate, Inanimate, Action.
Action > Extract.
Animate > Person.
Person > Student, Employee.
Student > ResearchAssistant.
Employee > ResearchAssistant.
Inanimate > Text.
Text > Book.

// Conceptual structures for the type Extract
// constitute an object.

Extract(canon) :: [Extract] -
                     -agnt-> [Person],
                     -obj-> [Inanimate].

Extract(schema) :: [Extract] -
                      -agnt-> [Person],
                      -obj-> [Text],
                      -target-> [Book].

Extract(schema) :: [Extract] -
                      -agnt-> [Person],
                      -obj-> [Inanimate : *1],
                      -manr-> [Strong],
                      -target-> [Inanimate] -on->
                                   [Inanimate : *1].

Consider now the next rule: it checks whether the given
information in G can be unified with a schema for the
given type v_type:

first, it creates a term v_term = v_type(schema) from
the list (v_type, schema);

second, it searches for a schema for the type v_type that
can be unified with G.

checkSchemas(v_type) :: G :-
   term_list(v_term, (v_type, schema)),
   v_term :: G.

Let us now consider some requests concerning the above program:

// search all the schemas for the type Extract:

?- Extract(schema) :: G.

{G = [Extract] -
        -agnt-> [Person],
        -obj-> [Text],
        -target-> [Book]}

{G = [Extract] -
        -agnt-> [Person],
        -obj-> [Inanimate] <-on- [Inanimate : *1],
        -manr-> [Strong],
        -target-> [Inanimate : *1]}

// Is the information [Extract] -target-> [Inanimate]
// contained in one of the Extract schemas?
?- checkSchemas(Extract) :: [Extract] -target-> [Inanimate].
{}

{}

?- checkSchemas(Extract) :: [Inanimate] <-from- [Extract]
                               -obj-> [Person].

no.
5. Object Inheritance in PROLOG+CG

Object inheritance in PROLOG+CG is based on the approach proposed by McCabe
[15]. Inheritance between objects is defined by inheritance rules.

Inheritance rule. It has the following form:

Term1 <- Term2.

where Term1 and Term2 are two terms that represent two objects. The above
inheritance rule means that the object identified by Term1 is a specialization of the
object identified by Term2. If a message that is sent to an object cannot be satisfied,
the interpreter will search for an inheritance rule for that object (if it has one) in order
to delegate the message to its super-object.

Here is an example that illustrates object inheritance in PROLOG+CG. 

Universal > Form, Attribute.
Form > Rectangle.
Rectangle > Square.
Attribute > Perimeter, Surface, Width, Heigth, Beautiful.

Rectangle(H, W) :: [Rectangle] -
                      -permOf-> [Perimeter],
                      -surfOf-> [Surface],
                      -widthOf-> [Width = W],
                      -heigthOf-> [Heigth = H] :-
   not(inf(W, H)).

Rectangle(H, W) :: Perimeter(P) :-
   val(P, mul(2, add(H, W))).

Rectangle(H, W) :: Surface(S) :-
   val(S, mul(H, W)).

// An inheritance rule.
Square(C) <- Rectangle(C, C).

Square(_) :: [Square] -attr-> [Beautiful].

Some requests concerning the above program are:

?- Rectangle(4, 5) :: Perimeter(P).

{P = 18}

?- Square(4) :: Perimeter(P).

{P = 16}

The primitive goal createInstance. PROLOG+CG provides the primitive goal
createInstance(Ident, Term), which enables the creation of an instance
Ident from an object identified by Term. When satisfied, createInstance(Ident, Term)
will add the following inheritance rule to the program: Ident <- Term.

The following request illustrates the use of createInstance (assuming the above
program):

?- createInstance(sq1, Square(6)), sq1 :: Surface(S).

{S = 36}
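
The delegation behaviour described in this section can be mimicked with a few lines of Python. This is a sketch under simplifying assumptions, not the PROLOG+CG interpreter: methods are plain functions instead of rules resolved by unification, the condition not(inf(W, H)) on the Rectangle fact is not modelled, and the names OBJECTS, INHERITS, INSTANCES, create_instance and send are all hypothetical.

```python
# Methods of each object, keyed by message name; each method receives the
# arguments of the object's descriptor, e.g. (H, W) for Rectangle(H, W).
OBJECTS = {
    "Rectangle": {
        "Perimeter": lambda h, w: 2 * (h + w),
        "Surface": lambda h, w: h * w,
    },
    "Square": {},   # defines neither Perimeter nor Surface itself
}

# Inheritance rules: descriptor -> function giving the super-object.
# Models "Square(C) <- Rectangle(C, C)."
INHERITS = {"Square": lambda c: ("Rectangle", (c, c))}

# Instances recorded by create_instance: ident -> (descriptor, args).
INSTANCES = {}


def create_instance(ident, name, args):
    """Model of createInstance(Ident, Term): record 'Ident <- Term'."""
    INSTANCES[ident] = (name, args)


def send(name, args, message):
    """Answer message in object name(args), delegating when needed."""
    if name in INSTANCES:                      # instance: go to its object
        parent, parent_args = INSTANCES[name]
        return send(parent, parent_args, message)
    method = OBJECTS.get(name, {}).get(message)
    if method is not None:
        return method(*args)
    if name in INHERITS:                       # delegate to the super-object
        parent, parent_args = INHERITS[name](*args)
        return send(parent, parent_args, message)
    return None                                # corresponds to the answer "no"


print(send("Rectangle", (4, 5), "Perimeter"))  # 18
print(send("Square", (4,), "Perimeter"))       # 16
create_instance("sq1", "Square", (6,))
print(send("sq1", (), "Surface"))              # 36
```

The three printed values correspond to the answers of the requests shown in this section.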



6. Current and Future Works 

PROLOG+CG constitutes a promising platform for many kinds of applications.
However, in order to obtain a more powerful and operational platform, current and
future work is planned to extend PROLOG+CG and its environment at different levels.

Current Works

We expect to realize the following extensions before August 2000:

■ At the language level, and especially the primitives level: incorporating other data
structures (and related operations) such as Vector, Hashtable and multimedia data
(text, html, image, sound, video).

Current work also concerns the various operations on simple and compound CGs:
MaximalJoin, Specialize, Generalize, Subsume, Analog, ExpandConcept and
Contract.

■ At the interpreter level: incorporating the “Expert System Mode”. If this mode is
activated, the interpreter, when it attempts to resolve an unknown goal, should ask
the user for its truth value. It should also be able to justify its behavior, i.e. respond
to Why/How questions.

■ At the interface level: replacing the text editor of the PROLOG+CG environment by
a hypertext-like editor. The latter will be useful when the data is a Vector, a
Hashtable or multimedia data. The editor will also enable
compression/decompression of a context: if the value of a concept is a CG, the
latter can be hidden (by a compression operation) or shown (by a decompression
operation).

Another current work item concerning the interface is the development of a debugger
with trace facilities.

Future Works 

Some future works include : 

■ Complete the “Expert System Mode” by adding forward-chaining. 

■ Extend PROLOG+CG to enable constraint-based programming. 

■ Incorporate the “knowledge base dynamic formation algorithm” [12, 13]. With 
the integration of this component, PROLOG+CG could be used as a memory- 
based language and as a general-purpose Case-Based Reasoning Tool. 

■ Extend PROLOG+CG with 3D and animation facilities (lava 3D will be explored 
for that purpose). 

■ Extend PROLOG+CG to account for concurrent and multi-threaded programming.

■ Extend PROLOG+CG so that it can be used as a platform for the development of
multi-agent systems.

■ Extend PROLOG+CG to enable connections with database systems.

■ Incorporate the web facilities. 

■ Develop a wizard for the design of applications that include interface units
such as menus, toolbars, buttons, text fields, etc.

Of course, this list is not exhaustive, and it is meant as an invitation to anyone who
is willing to work on shortening it!



7. Conclusion 

This paper discusses some limitations of PROLOG++ [11, 12] and presents the new 
version, PROLOG+CG, which manages to overcome them. The basic limitations and 
restrictions were due to the fact that PROLOG++ has been implemented with C- 
Prolog and a PROLOG++ program is transformed to an equivalent C-Prolog program. 
Thus, no change has been made to C-Prolog (the language and its interpreter) in order 
that we may include for instance CG as a basic data structure (beside Terms and Lists) 
and so as to extend C-Prolog unification accordingly. 

By developing our own version of Prolog, called J-PROLOG, we were able to modify 
the interpreter (including the unification operation) in order to give CG a status 
similar to that of Terms; as a basic data structure and as a representation of a goal. A 
PROLOG+CG program is directly “compiled” into an internal representation in terms 
of objects and JAVA structures (vector, hashtable, etc.). 

With these changes, PROLOG++ has been reviewed and reformulated into a more 
expressive, efficient, simple and orthogonal language. 

To assure the portability of the language, the implementation of PROLOG+CG 
(environment + interpreter) has been done with JAVA 2. 

Appendix 1 : The Grammar of PROLOG+CG 

Prolog+CGProgram = (Rule | Comment) { (Rule | Comment) } .

Rule = Specialization_Rule | Instantiation_Rule |
       Generalization_Rule | Inference_Rule .
Specialization_Rule = TypeIdentifier ">" TypeIdentifier TypeIdentifier }
Instantiation_Rule = TypeIdentifier "=" ReferentIdentifier ReferentIdentifier }
Generalization_Rule = ObjDescriptor ObjDescriptor
Inference_Rule = Head Tail] .
Tail = Goal Goal} .
Head = SimpleHead (SimpleHead | Variable)]
SimpleHead = (Term | CG) .
Goal = SimpleGoal SimpleGoal] .
SimpleGoal = Term | CG | Variable .
ObjDescriptor = Term .
Term = Identifier [ "(" PrlgCGData PrlgCGData} ")" ] .
PrlgCGData = Number | Boolean | Identifier | String |
             Variable | List | Term | CG .
List = "(" [ PrlgCGData PrlgCGData} [ "|" Variable] ]
CG = Concept [OutBranch | InBranch | Branchs] .
Branchs = (OutBranch | InBranch) (OutBranch | InBranch) } [";"]
OutBranch = RelationIdentifier CG .
InBranch = RelationIdentifier CG .
Concept = "[" Type Referent] ["=" Value] "]" .
Type = TypeIdentifier | Variable .
Referent = ReferentIdentifier | Multi_Referent | Variable .
Value = PrlgCGData .
Comment = "//" {Character} .
TypeIdentifier, ReferentIdentifier, RelationIdentifier = Identifier .
Multi_Referent = {Digit} .
Number = Digit { Digit } .
Boolean = "true" | "false" .
Identifier = Letter Letter {Letter | Digit | } .
String = """ { Character-other-than" } """ .
Variable = ( { Letter | Digit | }) | Letter |
           (Letter (Digit | ) { Letter | Digit | }) .

Appendix 2 : The Environment of PROLOG+CG 

The integrated environment of PROLOG+CG consists of a text editor, a “compiler” 
and the interpreter. The compiler performs a syntactic analysis of the program and if 
the analysis is successful, it generates an object file that contains an internal 
representation of the program in terms of objects and Java structures (vector, 
hashtable, etc.). The interpreter works on the object file. 

The environment provides three fixed windows (Figure 1) : a) Program window 
which enables the programmer to edit his program, b) Console window which enables 
the programmer to state and edit his request and to get responses from the system, and 
c) Debug window which is used by the compiler to inform the programmer about the 
result of the syntactic analysis. 

The environment also provides a window that shows the primitive operations of the
language and a window that presents the PROLOG+CG manual. Figure 1 gives a
snapshot of the PROLOG+CG environment.

[Screenshot of the PROLOG+CG environment (Beta Version 1.0, (c) Dr. Adil KABBAJ): Program, Console and Debug windows]

Fig. 1. PROLOG+CG environment



Abstract. In order to build descriptions of prototypical situations, we first
developed a system, MLK (Memorization for Learning Knowledge), allowing
us to gather events related to similar situations starting from descriptions found
in texts, these events being represented by conceptual graphs. One of the stages
in building these prototypes consists of generalizing some similar graphs in order
to produce a description. In this paper, we present a cost-bounded algorithm for
conceptual graph generalization, proceeding by ascending clustering. The use
of costs on the generalization operations allows us to control the growth of
the search space.



1. Introduction 

Text understanding requires knowledge about concrete situations that are developed 
in order to recognize links between the different events. Such knowledge is often 
represented by schemas whose content describes characters and events involved in a 
situation. However, handcoding schemas is a very difficult task, even in a limited
domain. Automatic acquisition of situation descriptions has been studied in [1], [2] and [3], but
essentially to learn new specialized situations of predefined ones. Inside a framework
that does not make any hypothesis on the existence of general knowledge, we have 
first conceived the system MLK (Memorization for Learning Knowledge) [4], [5], 
able to learn from the accumulation of its own experience. MLK memorizes each 
specific situation found in texts by aggregating it with an already memorized 
description if a similarity between their events is recognized. This process
incrementally builds aggregated Thematic Units (TU) that are precursors of general
schemas. Each event inside a TU is represented by a weighted conceptual graph (CG)
[6]. In order to build a general description of a situation from a TU, we have studied
how to generalize these events, i.e. conceptual graphs, to find a description level
accounting for different formulations of the same event. This kind of problem
falls under ascending clustering methods from positive examples. Generalization
algorithms for conceptual graphs have already been proposed [7], [8], but they only
generalize concepts, and not the graph itself. Generalizing both concepts and graphs
entails a combinatorial explosion when searching for all the possibilities. Therefore, we
have developed a cost-bounded generalization algorithm that limits the search space 
and leads us to propose informative generalizations. A cost is associated with each 
generalization operator, and we take advantage of semantic knowledge to produce 
meaningful descriptions of events.



2. System Overview

The system input is a set of conceptual graphs describing events related to
situations of the same kind, for example to come into a house, to attack a woman, to stab
a man with a knife, to arrest the murderer, etc. In order to elaborate a description of
the situation from these events, a main stage is to find the right level of description,
which leads to generalizing those specific graphs that describe the same kind of event. In the
preceding example, it would consist of proposing a description where the attack on the woman
and the stabbing of the man are generalized, whereas the two other events are left as
they are. The problem can then be formulated as follows: given a set of conceptual graphs,
find among them those that can be generalized while keeping an informative
description level. Our purpose is not to find the common generalization of all the
events, but a description level keeping the specificity of the kinds of events
represented by the graphs. Coming back to the example, our goal is not to generalize
the four events into a single one, which would be to carry out an action.

The method we propose consists of developing a sub-set of the possible 
generalizations of each graph, by generalizing concepts and removing relations. 
Generalization is controlled by the use of semantic knowledge: a lattice of concept 
types and constraints on the arguments related to some concepts, given by canonical 
graphs. This avoids overgeneralizations that would not have any meaning. Developing
all the generalizations of a graph is inconceivable, even when they are limited by
comparison with domain knowledge. Therefore, in order to limit the size of the
generalization space, a cost is associated to the generalization operators, allowing the 
system to associate a cost to each graph produced and to limit their formation by 
fixing a threshold. Costs are defined according to the task and encode the 
generalization level that is searched. All the generalizations are organized in 
configurations allowing the system to know which initial graphs are generalized. 
These configurations represent the different possibilities to describe the initial 
situation. The choice of one of them is done according to the number of initial graphs 
that are generalized and to the cost of the different configurations. Thus, the 
description level that is built is constrained both by the semantic knowledge and by 
the costs associated with the generalized graphs. This method allows us to bypass the
difficulty arising from the impossibility of defining negative examples in our task, and
hence of using classical learning algorithms.



3. Conceptual Graphs 

Semantic knowledge is represented in a lattice of concept types and by canonical
graphs associated with some types in order to specify their thematic roles (i.e. relations),
such as agent, object, etc., and semantic constraints on the concepts that might fill them (see
Fig. 1). We do not follow the Conceptual Graphs standard draft NCITS.T2/98-003.
The "syntactic" knowledge held by star graphs is coded by our canonical graphs. 
Conceptual graphs that represent events (see Fig. 2) are obtained by application of 
formation rules to one or several canonical graphs, as described in Sowa [6]. So, these 
graphs are built with respect to the constraints defined in the knowledge base. A graph 
resulting from the application of these rules is a specialization of one (or several) 
initial graph(s). In Fig. 2, the graph results from a maximal join between two graphs
produced after restrictions on types of concept in the canonical graphs associated with
stab and body. See [9] for a detailed presentation of the formation rules and their
relation with subsumption. The class of CGs we consider is the class of simple CGs
derived from canonical graphs, with n-ary relations and with a distinguished concept
node explained below.

[Graph drawings with the relation labels (agent), (dest.), (instrument) and the concept nodes [Body] and [Knife]]

Fig. 1. A canonical graph

Fig. 2. An event





As our graphs represent events, we distinguish a concept that is significant for the
event type, named the predicate (Stab in Fig. 2), this concept playing a particular role
inside our application. Other kinds of concept result from abstractions done during the
aggregation process in MLK. They already generalize instances linked to the same
predicate that are found in texts. See [4] and [5] for details about the process that
builds these graphs.



4. Graphs Generalization 

4.1 Method 

Some graphs belonging to the initial set may contain different predicates while being
very similar, as in Fig. 3. If graphs are identical except for their predicate, and if these
predicates are semantically close, these graphs have to be replaced by a graph that
generalizes them. The same principle applies if the graphs only differ by some details
in the predicate arguments. The problem is then to find the least common
generalization of several graphs, given that one does not know a priori which graphs in
the initial set have to be generalized.





Fig. 3. Similar conceptual graphs with different structures and concepts



Such a problem is similar to conceptual clustering, with conceptual graphs as the 
description language of the concepts to be learned [10]. Initial conceptual graphs are 
equivalent to first order logic formulas, made of a conjunction of positive predicates, 
the concepts and the relations of the graphs. This problem is close to the work of 
Mineau and Bournaud. Even if we do not search for a classification, which is the 
problem they have to solve, the generalization space has also to be generated in order 
to choose the generalization level we want to reach. In their work, a graph is
represented by the set of its relations to avoid comparing graphs, which is an NP-complete
problem. However, this approach cannot be applied in our application, given
that the structure of the graphs may be modified during the generalization process
when removing relations. We have to reason on the graphs themselves in order to keep
specific links between a predicate and its arguments, i.e. to maintain their
connectedness. It is then impossible to generate meaningful generalizations in a
limited time without controlling the growth of the generalization space. The process
we propose is a cost-bounded algorithm that associates a cost to each generalization 
operator and uses domain knowledge to avoid overgeneralizations. This algorithm has 
been conceived without taking into account the specificity of our application other 
than the presence of a central concept in conceptual graphs. So we consider that types 
of relations are organized in a lattice, even if our knowledge representation does not 
use this possibility. 

We have defined three primitive generalization operators: 

1. concept generalization: replacing a type of concept by its supertype in the lattice, 

2. relation generalization: replacing a type of relation by its supertype in the lattice, 

3. relation removing: removing a relation in the graph, and some associated concepts. 

We have not defined an operator for removing concepts because removing a
concept entails the removal of all the relations it is linked to. If we come back to the
example in Fig. 3, removing the concept Shop leads to removing the relation Loc, and
conversely removing the relation Loc leads to removing the concept Shop in order to
keep a connected graph (in such a case, we keep the connected component that
contains the predicate). Removing relations is then sufficient, since the results of removing
concepts are strictly included in the results of removing relations, which may entail the
removal of several concepts and relations.
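
As an illustration only, the three operators can be sketched in Python as follows. The dictionary-based graph encoding, the toy lattices, the example graph and all function names are assumptions made for this sketch and are not taken from the implementation described in this paper; relations are restricted to binary ones for brevity.

```python
from collections import deque

# Hypothetical toy lattices: each type maps to its direct supertype.
CONCEPT_SUPERTYPE = {"Knife": "Weapon", "Shop": "Place", "Stab": "Attack"}
RELATION_SUPERTYPE = {"Agent": "Relation", "Instrument": "Relation", "Loc": "Relation"}


def generalize_concept(graph, node):
    """Operator 1: replace the type of a concept by its supertype."""
    types = dict(graph["types"])
    types[node] = CONCEPT_SUPERTYPE[types[node]]
    return {**graph, "types": types}


def generalize_relation(graph, index):
    """Operator 2: replace the type of a relation by its supertype."""
    relations = list(graph["relations"])
    name, source, target = relations[index]
    relations[index] = (RELATION_SUPERTYPE[name], source, target)
    return {**graph, "relations": relations}


def remove_relation(graph, index):
    """Operator 3: remove a relation, then keep only the connected
    component that contains the predicate."""
    relations = [r for i, r in enumerate(graph["relations"]) if i != index]
    reachable = {graph["predicate"]}
    queue = deque(reachable)
    while queue:  # breadth-first search from the predicate
        current = queue.popleft()
        for _, source, target in relations:
            for a, b in ((source, target), (target, source)):
                if a == current and b not in reachable:
                    reachable.add(b)
                    queue.append(b)
    return {
        "predicate": graph["predicate"],
        "types": {n: t for n, t in graph["types"].items() if n in reachable},
        "relations": [r for r in relations if r[1] in reachable and r[2] in reachable],
    }


# Invented example graph: a stabbing with a knife, located in a shop.
example = {
    "predicate": "c0",
    "types": {"c0": "Stab", "c1": "Man", "c2": "Knife", "c3": "Shop"},
    "relations": [("Agent", "c0", "c1"), ("Instrument", "c0", "c2"), ("Loc", "c0", "c3")],
}
without_loc = remove_relation(example, 2)  # removing Loc also drops the Shop concept
```

In this sketch, removing the Loc relation leaves the Shop node disconnected from the predicate, so it disappears from the result, which mirrors the Shop/Loc case discussed above.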

These operators create a partial order between the resulting graphs, as defined in 
[6] and [9]. Precisely, the subsumption relation defined for conceptual graphs is the 
relation induced by the existence of a projection between two graphs [11]. 

We associate a cost with each operator since we have neither a domain theory
(general knowledge about situations, for example), nor a characterization of the pieces
of knowledge we want to learn that would give us a formal proof of the quality of the
generalization. We use an empirical evaluation based on semantics and the estimation
of the loss of information when generalizing.

In order to find this cost, we have reasoned on the effect of each operation in terms 
of loss of information it entails. In our context, the cost of each operator is related to 
the concept it applies to: predicate or not, concept derived from the canonical graph of 
the predicate or not. When generalizing a concept that is not derived from the 
canonical graph of the predicate, named common concept, the generalization is made 
on characteristics that are peripheral to the event. Removing relations, which, as already
said, causes the removal of concepts, entails the suppression of some of these
characteristics. Other concepts cannot be removed, since one cannot
remove relations belonging to the canonical graph of the predicate without losing the
meaning of the graph (the resulting graph would no longer be canonical).
Lastly, the operation we consider as very costly is the generalization of the predicate, 
since, as we have seen in part 2, we want to avoid many generalizations of the 
predicates in order to keep kinds of events specific to the described situation. So the 
least common generalization is a graph achieved by preferentially generalizing 
concepts other than the predicates. 

We have defined the following order of the costs of the operators, this ordering
characterizing the loss of information:

generalization of a common concept < generalization of a concept derived
from the canonical graph < removal of a relation < generalization of the
predicate

The relative order is more significant than the absolute values that define the costs. The
value given to each operator depends on the threshold that is fixed to compute
generalizations for a given application, the number of operations likely to be performed,
and the importance of each operation in relation to the others. These values have to be
fixed experimentally according to the application.

Let us now present some specificities due to the conceptual graph formalism and our
application. Firstly, it is not possible to suppress the predicate, since the resulting
graph(s) would no longer refer to the described event. Secondly, if two non-connected
graphs result from a suppression, we only keep the graph containing the predicate.
Thirdly, a concept is never generalized to a concept more general than the one in the
canonical graph. Lastly, when generalizing a predicate, a verification is done using
the new canonical graph to ensure the canonicity of the resulting graph. A
consequence of having this distinguished concept is that the complexity of the
operations on CGs is reduced, because there are fewer possible matchings between two
graphs. However, this does not greatly reduce the size of the effectively searched space.



4.2 The Generalization Algorithm 

One stage of the algorithm consists in building the generalization space, which
contains, at the beginning, the conceptual graphs to be generalized, i.e. the root
nodes. During the processing, each node resulting from a generalization is inserted in
this space only if it does not already exist. A node of depth n is the result of n
generalizations of a root node. For example, a conceptual graph that contains 3
concepts and 2 relations may be processed by applying 5 generalizations of concept or
relation and 2 removals of relation. It would generate, at most, 7 new nodes. The cost
associated to each node, related to the path towards the root node, is equal to the cost 
of its father plus the cost of the operation which has given it birth. The cost of a root 
node is zero. Applying such an algorithm until a common generalization is found, if it 
exists, is exponential in the general case. Bounding the set of the possible 
generalizations by fixing a maximal cost on the nodes entails a quite reasonable 
practical complexity (this latter point will be detailed in the following section) and all 
the generalizations having a cost less than this threshold are computed. 
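
The cost bookkeeping described in this paragraph can be sketched as below. This is a minimal illustration under stated assumptions, not the algorithm of Fig. 6: graphs are reduced to hypothetical (predicate, object) pairs, the toy lattice is invented, the operator costs are the low values listed in Section 5.3, a node keeps its cheapest cost per root, and the bound is applied as "cost <= threshold".

```python
import heapq
from itertools import count

# Hypothetical toy lattice: each type maps to its direct supertype.
SUPERTYPE = {"ToNibble": "ToEat", "ToDevour": "ToEat", "Cake": "Food", "Meat": "Food"}


def expand(graph):
    """Return (operator cost, generalized graph) pairs for a (predicate, object) graph."""
    predicate, obj = graph
    children = []
    if predicate in SUPERTYPE:
        children.append((2, (SUPERTYPE[predicate], obj)))  # predicate generalization
    if obj in SUPERTYPE:
        children.append((1, (predicate, SUPERTYPE[obj])))  # concept generalization
    return children


def build_generalization_space(roots, expand, threshold):
    """Map every generalization within the threshold to {root: cheapest cost}."""
    space = {}
    tie = count()  # tie-breaker so the heap never has to compare graphs
    for root in roots:
        best = {root: 0}  # the cost of a root node is zero
        heap = [(0, next(tie), root)]
        while heap:
            cost, _, graph = heapq.heappop(heap)
            if cost > best[graph]:
                continue  # stale heap entry
            for op_cost, child in expand(graph):
                child_cost = cost + op_cost  # father's cost plus operator cost
                if child_cost <= threshold and child_cost < best.get(child, float("inf")):
                    best[child] = child_cost
                    heapq.heappush(heap, (child_cost, next(tie), child))
        for graph, cost in best.items():
            space.setdefault(graph, {})[root] = cost
    return space


roots = [("ToNibble", "Cake"), ("ToDevour", "Meat")]
space = build_generalization_space(roots, expand, threshold=3)
# Nodes reached from more than one root are the candidate common generalizations.
shared = {graph: costs for graph, costs in space.items() if len(costs) > 1}
```

With a threshold of 3 the two toy roots meet at ("ToEat", "Food"); lowering the threshold to 2 cuts both paths before they meet, which is how the bound limits the growth of the space.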

Fig. 4 shows an example where three conceptual graphs (the root nodes x1, x2 and
x3) have been generalized into the graphs (a, b, c, d), (c, b, e) and (e, f, g, d)
respectively. Values on the branches are the costs of the applied operators.

Costs, which stop the generalization process, do more than maintain the semantics
of the graphs: they make it possible to control the growth of the generalization space. In
order to further improve the performance of the algorithm, we have
implemented some principles. For example, if the cost to find a supertype common to
the predicate of a graph and the predicates of each other graph of the initial set
exceeds the given threshold, this graph is removed from the initial set. By using this
principle, the algorithm applies to graphs that may be a generalization of two root
graphs, and not to the very initial graphs, given that they all contain a different predicate
(see the Initialization part of the algorithm in Fig. 6). In addition, graphs are
indexed by their predicate and their number of relations and concepts, so that
comparison of graphs is only done if these characteristics are identical.




Fig. 4. A generalization space built from 3 root graphs 

The aim of the algorithm is to generate all the possible descriptions of the initial set,
the set of configurations. A configuration is a set of conceptual graphs, generalized or
not, that represents a partition of the initial examples, the root graphs. After
application of the algorithm, only the graphs that generalize several root graphs,
together with the root graphs themselves, are kept. Other nodes, which just generalize a unique initial
conceptual graph, are useless for our purpose (for the example in Fig. 4, retained
nodes appear in the first column in Fig. 5). The generalization cost of a node is the
average of its generalization costs, which correspond to
the different paths from the different root nodes. Finding the set of
configurations consists of building all the subsets of nodes such that each initial node is
generalized once and only once in the subset. Each configuration is evaluated by the
average cost of its elements.

Configurations = {[c, x3] (2), [b, x3] (1), [d, x2] (3.5), [e, x1] (1.5), [x1, x2, x3] (0)}
Fig. 5. Results of the generalization algorithm
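
The enumeration of configurations can be illustrated on the example of Fig. 4 with the short sketch below. The coverage of each node is read off the description of Fig. 4; the individual node costs are an assumption, back-derived from the averages printed in Fig. 5 (with root graphs at cost 0), and the function name is hypothetical.

```python
from itertools import combinations

ROOTS = ("x1", "x2", "x3")
# Root graphs generalized by each retained node (roots generalize themselves).
COVERS = {
    "b": {"x1", "x2"}, "c": {"x1", "x2"}, "d": {"x1", "x3"}, "e": {"x2", "x3"},
    "x1": {"x1"}, "x2": {"x2"}, "x3": {"x3"},
}
# Assumed node costs, chosen so that the averages match those of Fig. 5.
COST = {"b": 2.0, "c": 4.0, "d": 7.0, "e": 3.0, "x1": 0.0, "x2": 0.0, "x3": 0.0}


def configurations(roots, covers, cost):
    """Enumerate the subsets of nodes that cover every root once and only once."""
    found = []
    names = sorted(covers)
    for size in range(1, len(roots) + 1):
        for subset in combinations(names, size):
            covered = [root for name in subset for root in covers[name]]
            if sorted(covered) == sorted(roots):  # exact partition of the roots
                average = sum(cost[name] for name in subset) / len(subset)
                found.append((subset, average))
    return found


for elements, average in configurations(ROOTS, COVERS, COST):
    print(list(elements), average)
```

Under these assumptions the sketch yields the five configurations listed in Fig. 5. The exhaustive enumeration is only practical for such small examples.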

Configurations are computed during the generalization process (cf. Updating 
configurations in the algorithm). At each step, partial configurations, corresponding to 
the retained generalizations, are updated if the new graph generalizes at least two root 
graphs. 

If more than one configuration has been built at the end of the processing, we
choose the one that satisfies the following criteria, applied in this order: a) the
configuration that corresponds to the generalization of the maximum number of root
graphs; b) the configuration containing the fewest elements; and c) the
configuration having the minimal cost, other than zero, considering it is the most
likely to be the most common specific description of the initial events. By means of
these criteria, we encode that the best description of the initial situation corresponds
to the maximal regrouping of graphs whose cost is less than the limit. In the example
of Fig. 4, these criteria lead us to choose the configuration [b, x3], which has an average cost
equal to 1. These criteria fit our application, even if, for other kinds of applications,
another order or other criteria may have to be found.
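
One possible reading of these three criteria, applied to the configurations of Fig. 5, is sketched below. The function name and the way the criteria are turned into a sort key are assumptions of this sketch; in particular, "other than zero" is interpreted as discarding zero-cost configurations before ranking.

```python
# (elements, average cost) pairs as listed in Fig. 5.
CONFIGURATIONS = [
    (("c", "x3"), 2.0),
    (("b", "x3"), 1.0),
    (("d", "x2"), 3.5),
    (("e", "x1"), 1.5),
    (("x1", "x2", "x3"), 0.0),
]
ROOTS = {"x1", "x2", "x3"}


def choose(configurations, roots):
    """Rank by a) most root graphs generalized, b) fewest elements, c) lowest cost."""
    def key(item):
        elements, cost = item
        # A root graph is generalized when it does not appear as itself.
        generalized = len(roots) - sum(1 for element in elements if element in roots)
        return (-generalized, len(elements), cost)

    non_zero = [item for item in configurations if item[1] > 0]
    return min(non_zero or configurations, key=key)


best = choose(CONFIGURATIONS, ROOTS)
```

Here the four non-trivial configurations each generalize two root graphs and contain two elements, so the choice is settled by the cost criterion, giving [b, x3] as stated above.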
Fig. 6. The generalization algorithm



The algorithm we propose is general enough to be used in other applications. We will
see in the following sections that heuristics coming from the application domain,
expressed with simple numeric values, entail considerable pruning of the search
space, allowing the algorithm to find only the most informative generalizations. As
for all algorithms depending on heuristics, it is not easy to formally establish the
complexity of our algorithm; we have therefore worked on demonstrating the
adequacy and the efficiency of this method through extensive tests.



5. Results 

The process has been applied with success to the results of MLK, but this does not
amount to a validation on a sufficiently large quantity of data, because the text representations
that compose the input of MLK are currently handcoded. That is why, in order
to test our algorithm more thoroughly, we designed a test generator. Its knowledge
(lattice of types and canonical graphs) is compatible with our application and the
generator is parameterizable so that it can simulate different kinds of application.



5.1 Domain Knowledge 

Our application exploits a lattice of concept types and a set of canonical graphs.
The lattice (see Fig. 8) results from the work of Chibout [12], who created an
ontology of approximately 3,000 concepts for verbs and entities. We extracted a sub-part
from this ontology so as to obtain a lattice regular enough in depth and in
breadth. Thus, 368 concepts were extracted and distributed in a lattice with an average
depth of 5.59 for a branching factor of 2.75. We have selected 8 verbs from the 128
verbs retained and assigned canonical graphs to them. These canonical graphs are defined
relative to the types retained in the lattice and so as to produce homogeneous bench tests, so
that global results are meaningful.



Fig. 7. Canonical graph of the concept denoted by the verb "to drink" 



The 8 types are ToCauseDeath, ToCauseToFeel, ToDivide, ToBreak, ToFly,
ToAbsorb, ToEat and ToDrink. Subtypes inherit the canonical graph of their father.
Figure 7 shows the canonical graph of ToDrink. The Function concept is the
supertype of 55 concepts, among them Artisan, Killer and Deputy. Liquid is the
supertype of, among others, Water, Alcohol and Syrup.

The lattice of relation types is limited in our application to a flat lattice: all the
relations have the same supertype (⊤) and the same subtype (⊥). For our tests, we
have used four relations: Relation, Agent, Object and Place.



5.2 Bench Test Generation 

We have generated sets of graphs with controllable properties for the generalization.
The graphs are obtained by specializing canonical graphs in randomly chosen directions and
to different depths; our task is then to retrieve the original common graphs, or, better, some
specialization of them that might not have appeared during the generation phase. So,
our generator builds sets of graphs such that each set possesses a partition of its graphs
where each part (named family) is made of graphs that are specialized from a graph
more specific than their common canonical graph. This definition of the searched
generalizations comes from our application, for which it is not suitable to generalize
events by their canonical graph: it does not give an informative description.

Fig. 8. Extracts from the concept ontology



To produce a bench test, the generator randomly chooses n predicates (P1 and P2 in
Fig. 9), each one being the origin of one family. The canonical graph of each
predicate is then specialized several times (s1 times in family 1 and s2 times in family
2) to generate n graphs, named ancestor graphs. Each ancestor graph is differently
specialized (k1, k2, ..., k5 times in family 1 and k1, k2 and k3 times in family 2),
creating different specializations named cousin graphs.

The input data of the generalization process are the set of all the cousin graphs of
the different families (the 8 cousin graphs), the goal being to retrieve the set made of the
ancestor graphs. The numbers of specialization operations applied (restriction of a type
belonging to the canonical graph or not, addition of a relation) are set at the time of
the bench test specification.

Fig. 9. Bench test generation framework

We can see below an example of test containing four graphs with two families of 
two graphs each. The graphs have been obtained following the parameters presented 
in Fig. 10. 



Graph 1:
[ToNibble] {
  - (Agent) -> [Nurse],
  - (Object) -> [Cake],
  - (Rel) -> [Apple] }

Graph 2:
[ToDevour] {
  - (Agent) -> [Nurse],
  - (Object) -> [Meat],
  - (Rel) -> [Rice] }

Graph 3:
[ToKill] {
  - (Object) -> [Reptile],
  - (Agent) -> [Assassin],
  - (Rel) -> [Car] }

Graph 4:
[ToEmbed] {
  - (Agent) -> [Killer],
  - (Objet) -> [Insect],
  - (Rel) -> [Water] }

Fig. 10. Parameters for generating a set of graphs



This example entails the construction of 6 configurations in a total time of 115 s. We
can see below the optimal solution, obtained in 15 seconds, and a non-optimal solution.
Costs of operations are specified in section 5.4.

Optimal Solution:

[ToEat] {
  - (Agent) -> [Nurse],
  - (Object) -> [Food] }
Cost: 8.0

[ToCauseDeath] {
  - (Agent) -> [Criminal],
  - (Objet) -> [Animal] }
Cost: 10.0

Non-optimal Solution:

[ToEat] {
  - (Agent) -> [MedicalProfession],
  - (Object) -> [Aliment] }
Cost: 10.0

[ToCauseDeath] {
  - (Agent) -> [Criminal],
  - (Objet) -> [Animal] }
Cost: 10.0



If we look again at the definition of a good solution given in Section 4.2, the optimal
solution in this example respects the established criteria: maximum number of root graphs
generalized, minimum number of graphs in the configuration and minimal cost in the
case of a tie between configurations. In the non-optimal configuration, the concept Nurse in
the two graphs of the first family has been overgeneralized to MedicalProfession. So
we have indeed obtained the best solution, with a cost lower than that of the other
solution. The four other solutions, not presented here, have lower costs but do not
generalize all the graphs.

We will now describe two bench tests. The first one tests a difficult case with very
specialized graphs and low generalization costs. It entails the generation of a great
number of configurations and high computation times. The second one shows the
efficiency of the system with higher costs.



5.3 A Borderline Case 

We produced 33 sets of 4 graphs containing 3 to 5 concepts and belonging to 2
families. These graphs were produced by applying 3 to 8 specialization operations to
canonical graphs. The threshold for the generalization was set to 10 and the
processing time was limited to 1 hour per case. For this data set, we assigned low values to
the costs of operations in order to test the borderline case: the costs allow the system to
compute all the ancestor graphs. These costs are the following:

1. any concept generalization: 1, 

2. generalization of one concept of the canonical graph: 1, 

3. predicate generalization: 2, 

4. relation suppression: 1. 
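
As a minimal sketch of how such a cost table bounds the search, the following Python 
fragment sums the costs of a sequence of generalization operations and compares the 
total with the threshold. It is not the Smalltalk implementation: the dictionary keys 
and function names are hypothetical, and the example sequence of operations is invented 
for illustration. 

```python
# Generalization costs of the borderline test (values from the list above).
# The dictionary keys are hypothetical names chosen for this sketch.
COSTS = {
    "concept_generalization": 1,
    "canonical_concept_generalization": 1,
    "predicate_generalization": 2,
    "relation_suppression": 1,
}
THRESHOLD = 10  # generalization threshold used in the tests


def path_cost(operations, costs=COSTS):
    """Sum the costs of a sequence of generalization operations."""
    return sum(costs[op] for op in operations)


def within_bound(operations, costs=COSTS, threshold=THRESHOLD):
    """Return True if the cumulative cost does not exceed the threshold."""
    return path_cost(operations, costs) <= threshold


# Invented sequence of 8 operations: 5 concept generalizations,
# 2 relation suppressions and 1 predicate generalization.
ops = (["concept_generalization"] * 5
       + ["relation_suppression"] * 2
       + ["predicate_generalization"])
print(path_cost(ops), within_bound(ops))
```

With these low costs, a sequence of 8 operations such as this one stays under the 
threshold, which is why the search space remains large in this test. 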

The generalization process finds 32.15 configurations on average per bench test, in 
an average time of 28 min, with 30% of the generalization processes reaching the time 
limit. The 70% of cases that are solved take 15 min on average. 
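
These three figures can be checked against each other. Assuming that every case 
reaching the limit is counted at the full 60 minutes (an assumption, since the way these 
cases enter the average is not stated), the overall mean time is approximately: 

```latex
\[
\bar{t} \approx 0.70 \times 15 + 0.30 \times 60 = 10.5 + 18 = 28.5 \ \text{min}
\]
```

This agrees with the reported average of 28 min up to rounding of the inputs. 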



5.4 Costs Utilization 

To show the benefit of carefully chosen costs, we carried out the same kind of 
tests, but with costs multiplied by 2. We generated 60 bench tests with 4 graphs each. 
With a time limit set to 1 hour, we obtained 12.2 configurations on average per case. 
With 20 min, we obtained 11.9. So, 97.5% of the configurations obtainable in 1 hour 
were obtained in less than 20 min. The following results were obtained within this time 
limit: 96.7% of the bench tests led to at least one solution and 73.3% of the cases did 
not reach the time limit. 
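
The ratio and the underlying test counts can be recovered from the reported figures. 
The short Python check below is only an arithmetic illustration; the variable names are 
ours, and the counts are obtained by rounding the reported percentages of 60 tests. 

```python
n_tests = 60
avg_configs_1h = 12.2     # average configurations per case with a 1 hour limit
avg_configs_20min = 11.9  # average configurations per case with a 20 min limit

# Share of the 1 hour configurations already obtained within 20 min.
share_found_in_20min = avg_configs_20min / avg_configs_1h

# Test counts implied by the reported percentages.
tests_with_solution = round(0.967 * n_tests)
tests_within_limit = round(0.733 * n_tests)

print(f"{share_found_in_20min:.1%}")            # 97.5%
print(tests_with_solution, tests_within_limit)  # 58 44
```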
Fig. 11. Percentage of computed configurations relative to the processing time 

Figure 11 shows the percentage of configurations obtained for all bench tests as a 
function of time. It shows that within a reasonable time of 7 min, 90% of the solutions 
obtainable in 20 min (92.3% of those obtainable in 1 hour) are already obtained. Thus, 
depending on task requirements, if missing some solutions is acceptable, one can 
parameterize the generalization system with quite acceptable processing times. 

For these tests, the relevant (i.e. informative) solutions are those that contain 
generalized graphs equal to the ancestor graphs from which the test graphs were 
derived. Such a relevant solution can only be found if, for each graph, the sum of the 
costs of the specialization operations applied to it is lower than or equal to the 
generalization limit, here 10. This is the case for 22 bench tests (37%). With 
a processing time of one hour, the relevant solution is found for all these cases, and 
within 20 min, 20 solutions are found (90.9%). In the other 63%, for which the theoretical 
cost of the optimum solution is higher than the limit, we nevertheless find 15.8% of 
configurations in which the number of generalized graphs is the same as the number 
of families. This can be explained by the fact that cousin graphs are randomly 
generated. If by chance the specialization process goes the same way for some graphs, 
then the effective relevant solution is more specific than the ancestor graph. 
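
The condition for a relevant solution to be obtainable can be written compactly. The 
notation is ours and does not appear elsewhere in the paper: o_1, ..., o_k are the 
specialization operations applied to a graph, c(o_i) is the cost of undoing operation 
o_i, and L is the generalization limit. 

```latex
\[
\sum_{i=1}^{k} c(o_i) \le L, \qquad L = 10
\]
\[
\frac{22}{60} \approx 36.7\%, \qquad
\frac{20}{22} \approx 90.9\%, \qquad
1 - \frac{22}{60} \approx 63.3\%
\]
```

The second line reproduces the reported proportions from the counts given above. 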



5.5 Analysis 

It should be noted that in the first series of tests, the obtainable relevant solutions are 
obtained in 70% of the cases (23/33). In the second series, 100% of the obtainable 
solutions are found in one hour; however, it contains fewer such test cases. So, 
with low costs, a great number of solutions is found but we also miss some good 
solutions. Conversely, with high costs, the number of possible solutions is lower 
and the system finds all of them. When we apply the system to real data, we will 
have to choose costs between these two extremes. 

These tests show that a cost-bounded algorithm allows us to generalize conceptual 
graphs in a reasonable time, in tasks where costs on the generalization operators are 
definable. Differences between the two series of tests show the benefits of adjusting 
the costs and the limit threshold in order to produce interesting generalizations and to 
avoid producing overgeneralized graphs in the application, i.e. valid graphs whose 
description level is too general. 

It is not an easy task to show the validity of solutions obtained with unsupervised 
learning. Thus, we have designed our test protocol in such a way that the sought 
solution can be characterized. In our application framework, the validation could be 
done externally by an expert who would answer the following question: "Does the 
produced solution seem coherent to you: are events described at a correct description 
level compared to the described situation?". If the expert is not satisfied, it is possible 
to execute the process again to search for new generalizations. Another possibility 
would be to validate a system using the learned knowledge. 



6. Previous Works 

Mineau [7] and Bournaud [8] have conceived algorithms to build a classification from 
an initial set of conceptual graphs. Even if we do not want to classify data, finding a 
generalization level can be seen as an equivalent problem as it requires the algorithm 
to compute possible generalizations. The major difference comes from the 
generalization operators used. Suppression of relations entails the necessity of a 
complete description of each generalized graph and not only a partial description by 
their relations. To limit the effective complexity, we have introduced costs associated 
with operators. These costs have to be fixed relative to the type of generalizations 
expected in the application. As in [8], we also use domain knowledge to constrain 
generalization. 

Our work falls within the domain of Inductive Logic Programming (ILP) [13], and our 
problem is particularly close to the work of Esra Erdem and Pierre Flener [14], who 
redefine the minimal generalization in order to find a minimal set of generalized 
clauses according to an over-generalization criterion. In the field of ILP and 
conceptual graphs, our kind of problem is studied in the work of Marc Champesne on 
the reduction of the search space by the use of the notion of "empirical subsumption" 
[11]. However, his results can only be applied to non-connected graphs, whereas our 
algorithm handles connected ones. 

Regarding our application itself, i.e. learning descriptions of prototypical situations, our 
work differs from [1], [2] and [3] essentially because we do not assume the 
existence of previous knowledge about situations, and learning is completely 
unsupervised. The system MLK, which implements this application, gathers and 
selects events and their relevant characteristics when recurrent situations are found in 
texts, and the system proposed in this paper handles the construction of a general 
description with respect to semantic knowledge. 



7. Conclusion 

Learning structures in order to describe concrete situations has led us to construct the 
system MLK, which is able to identify events linked to the same situation. With the aim of 
generating general descriptions for these situations, we have studied the 
generalization of events represented by conceptual graphs. However, the 
generalization algorithm we propose is largely independent of our particular context 
and could be used in other applications. 

This algorithm is completely implemented in Smalltalk. The tests made show the 
benefit of fixing costs on generalization operators in applications having only a weak 
domain theory. 
The bench test generation protocol we have developed allows us to control 
different parameters such as the definition of an optimal solution, the number of 
operations to be applied in order to find this solution, and the homogeneity of the 
bench tests. It presents two major advantages: testing the efficiency of our approach 
and allowing us to find values for the various parameters of the generalization 
algorithm through successive tests. 

We are currently working on the automatic acquisition of ontologies from 
partial syntactic analysis of phrases. In the medium term, this work will allow us 
to apply the described generalization process to a large amount of data very 
similar to Thematic Units. In the longer term, this should lead to the 
application of the complete MLK system to data automatically extracted from large 
volumes of text. 